Part 1: Monday, July 8, 2024 14:00 – 18:00 EDT
Part 2: Tuesday, July 9, 2024 14:00 – 18:00 EDT
Large Language Models (LLMs) such as ChatGPT have shown remarkable capabilities in understanding and generating language across diverse disciplines. In biomedical data science and computational biology, LLMs can substantially aid information access, data analysis, and knowledge discovery. This tutorial offers an introductory-level, hands-on guide to understanding and using LLMs in biomedical data science. It begins by establishing common ground with introductions to LLMs and to biomedical data science. It then covers the core applications of LLMs in biomedical data science and computational biology: retrieval-augmented generation, database functionalities, and code generation. Case studies illustrate how to bridge the gap between technical feasibility and practical utility in biomedical data science, and hands-on exercises let participants apply what they learn in real time. Participants will also become acquainted with OpenAI's ChatGPT and open-source LLMs, including their design, use cases, limitations, and prospects.
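To give a flavor of retrieval-augmented generation, one of the applications named above, here is a minimal sketch in Python. It is not taken from the tutorial materials: the toy corpus, the word-overlap scoring, and the names `CORPUS`, `retrieve`, and `build_prompt` are all illustrative assumptions. A real pipeline would likely retrieve from PubMed and send the prompt to an LLM, which this sketch deliberately leaves out.

```python
import re
from collections import Counter

# Hypothetical toy corpus standing in for retrieved abstracts
# (illustrative sentences, not real PubMed records).
CORPUS = {
    "doc1": "BRCA1 mutations increase the risk of breast and ovarian cancer.",
    "doc2": "Metformin is a first-line therapy for type 2 diabetes.",
    "doc3": "TP53 is a tumor suppressor gene frequently mutated in cancer.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, corpus, k=2):
    """Return the ids of the k documents sharing the most words with the question.

    Word overlap is a stand-in for a real retriever (assumption).
    """
    q_counts = Counter(tokenize(question))
    scores = {}
    for doc_id, text in corpus.items():
        d_counts = Counter(tokenize(text))
        scores[doc_id] = sum(min(q_counts[t], d_counts[t]) for t in q_counts)
    # Sort by descending score, breaking ties by document id.
    ranked = sorted(corpus, key=lambda doc_id: (-scores[doc_id], doc_id))
    return ranked[:k]


def build_prompt(question, corpus, k=2):
    """Assemble a prompt that grounds the question in the retrieved documents."""
    doc_ids = retrieve(question, corpus, k)
    context = "\n".join(f"[{doc_id}] {corpus[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the context below and cite document ids.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_prompt("Which gene mutations raise breast cancer risk?", CORPUS))
```

The resulting string would then be passed to whichever LLM is in use; keeping retrieval and prompt assembly separate from the model call makes each step easy to inspect and swap out.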
View on ISMB 2024: https://www.iscb.org/ismb2024/programme-schedule/tutorials#vt1
Tutorial registration: https://www.iscb.org/ismb2024/register#tutorials
Room link: https://iscb.junolive.co/ISMB24/live/breakouts/ismb2024_tutorialvt1-1 and https://iscb.junolive.co/ISMB24/live/breakouts/ismb2024_tutorialvt1-2
Robert Xiangru Tang, Yale University
Qiao Jin, NCBI/NLM/NIH
Hufeng Zhou, Harvard University
Shubo Tian, NCBI/NLM/NIH
Zhiyong Lu, NCBI/NLM/NIH
Mark Gerstein, Yale University
Time in EDT (Montreal, Quebec, Canada Local Time)
Time | Section | Presenter |
---|---|---|
Part 1 (Monday, July 8, 2024) | ||
14:00 - 14:10 | Overview and Welcome [Slides] | Robert Tang |
14:10 - 14:40 | Introduction to LLMs with a Focus on Biomedical Data Science [Slides] | Shubo Tian |
14:40 - 15:10 | How to Use GPT-3.5 and GPT-4 with Python [Slides] | Qiao Jin |
15:10 - 15:30 | How to Use Open-source LLMs with Python [Slides] | Robert Tang |
15:30 - 15:45 | Coffee Break | |
15:45 - 16:10 | Code Generation in Bioinformatics [Slides] | Robert Tang |
16:10 - 16:35 | Retrieval-Augmented Generation with Large Language Models [Slides] | Qiao Jin |
16:35 - 17:00 | Querying PubMed with RAG to Answer Biomedical Questions with GPT-4 [Slides] | Qiao Jin |
Part 2 (Tuesday, July 9, 2024) | ||
14:00 - 14:45 | Large Language Models for Biomedicine: from PubMed Search to Gene Set Analysis [Slides] | Zhiyong Lu |
14:45 - 15:30 | Developing Computational Representations of Disease-Relevant Molecules: 3 Case Studies for AI in Biomedicine [Slides] | Mark Gerstein |
15:30 - 15:45 | Coffee Break | |
15:45 - 16:10 | Integrating Biomedical Data Database Development with LLMs [Slides] | Hufeng Zhou |
16:10 - 16:35 | Database Query Generation with LLMs [Slides] | Hufeng Zhou |
16:35 | Closing Remarks | Robert Tang |
*Participants should join our tutorial via Junolive.