OpenMEDLab is an open-source platform for sharing medical foundation models across multiple modalities, e.g., medical imaging, medical NLP, bioinformatics, and protein engineering. It aims to promote novel approaches to long-tail problems in medicine while seeking solutions that achieve lower cost, higher efficiency, and better generalizability in training medical AI models. The new learning paradigm of adapting foundation models to downstream applications makes it possible to develop innovative solutions for cross-domain and cross-modality diagnostic tasks efficiently. OpenMEDLab is distinguished by several features:
- World's first open-source platform for medical foundation models.
- 10+ medical data modalities targeting a variety of clinical and research problems.
- Pioneering works on the new learning paradigm with foundation models, including pre-trained models, code, and data.
- Release of multiple medical datasets for pre-training and downstream applications.
- Collaboration with top medical institutes and facilities.
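As a concrete illustration of the adaptation paradigm mentioned above, the minimal sketch below trains only a new classification head on top of a frozen, generically pre-trained backbone (linear probing). It uses a torchvision ResNet-50 as a stand-in for a medical foundation model; the class count and data are placeholders and not part of any OpenMEDLab API.

```python
# Minimal linear-probing sketch: adapt a pre-trained backbone to a small
# downstream medical classification task by training only a new head.
# The backbone (torchvision ResNet-50) is a generic stand-in for a medical
# foundation model; the class count and data below are placeholders.
import torch
import torch.nn as nn
from torchvision import models

num_classes = 5  # hypothetical number of downstream disease categories

# Load an ImageNet-pre-trained backbone and freeze its parameters.
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
for param in backbone.parameters():
    param.requires_grad = False

# Replace the classification head; only these weights will be updated.
backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)

optimizer = torch.optim.AdamW(backbone.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One illustrative training step with random stand-in data.
images = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, num_classes, (8,))
loss = criterion(backbone(images), labels)
loss.backward()
optimizer.step()
```

Because only the small head is trained, such adaptation typically needs far fewer labeled samples and much less compute than training a model from scratch, which is the efficiency argument behind the foundation-model paradigm.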
- OpenMEDLab: An Open-source Platform for Multi-modality Foundation Models in Medicine. ArXiv'2024 [Paper]
- On the Challenges and Perspectives of Foundation Models for Medical Image Analysis. Medical Image Analysis [Paper]
- MedFMC: A Real-world Dataset and Benchmark For Foundation Model Adaptation in Medical Image Classification. Scientific Data [Code]
- A Large-scale Synthetic Pathological Dataset for Deep Learning-enabled Segmentation of Breast Cancer. Scientific Data [Code]
- D-LMBmap: a fully automated deep-learning pipeline for whole-brain profiling of neural circuitry. Nature Methods [Code]
- A Foundation Model for Generalizable Disease Detection from Retinal Images. Nature [Code]
In OpenMEDLab, we open-source a bundle of medical foundation models and their applications in various medical data modalities, ranging from medical image analysis and medical large language models to protein engineering, as shown in the diagram above.
Image from "S. Zhang and D. Metaxas. On the Challenges and Perspectives of Foundation Models for Medical Image Analysis. Medical Image Analysis"
- The Medical Large Language Model: PULSE.
- The 3D CT Segmentation Foundation Model: MIS-FM.
- The 2D and 3D Medical Segmentation Foundation Models using SAM: SAM-Med2D, SAM-Med3D (see the prompt-based segmentation sketch below).
- The Foundation Model for Retinal Image: RETFound.
- The Foundation Model for Whole-brain Axon Segmentation and Circuitry Profiling: D-LMBmap.
- The Foundation Model for Endoscopy Video Analysis: Endo-FM.
- The Survey on Data-Centric Foundation Models in Computational Healthcare: Data-Centric-FM-Healthcare.
More foundation models for medical images can be found here.
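SAM-Med2D and SAM-Med3D broadly follow the promptable segmentation interface popularized by the original Segment Anything Model. The sketch below illustrates point-prompted prediction using the original `segment_anything` package as a stand-in; the checkpoint path, dummy image, and prompt coordinates are placeholders, and the medical variants' own loading APIs may differ.

```python
# Point-prompted segmentation with the original segment-anything package,
# shown as a generic stand-in for SAM-style medical variants.
# The checkpoint path and the dummy image/prompt below are placeholders.
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)  # stand-in for an RGB slice
predictor.set_image(image)

# A single foreground point prompt at (x, y); label 1 marks foreground.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 256]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
print(masks.shape, scores)  # three candidate masks with their quality scores
```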
In OpenMEDLab, we also open-source a collection of medical datasets to support research on foundation models and their applications across various medical data modalities, including CT, MR, pathology, and more.
- MedFM Dataset: Real-world Dataset and Benchmark For Foundation Model Adaptation in Medical Image Classification.
- SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks.
- SNOW Dataset: A Large-scale Synthetic Pathological Dataset for Deep Learning-enabled Segmentation of Breast Cancer.
- Endo-FM Private Dataset: A Large-scale Endoscopic Video Dataset with over 33K Video Clips.
🔥🔥🔥 The collection of public medical datasets is continuously updated: Awesome-Medical-Dataset.
In both clinical applications and research, there is a strong need to evaluate model performance.
- MedBench: An Open Evaluation Platform for Chinese Medical Large Language Models.
- OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM.
- A-Eval: A Benchmark for Cross-Dataset Evaluation of Abdominal Multi-Organ Segmentation.
- ELO: The widely used Elo rating tournament evaluation method for calculating the relative skill levels of LLMs (a minimal update-rule sketch follows this list).
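For context, the Elo update itself is compact: a model's rating moves in proportion to the gap between its actual result and the result expected from the current ratings. The sketch below is a generic illustration of that rule; the K-factor value and function names are assumptions, not taken from the ELO repository.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Return updated ratings after one pairwise comparison.

    score_a is 1.0 if A wins, 0.5 for a draw, 0.0 if A loses.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

# Example: two LLMs start at 1500; model A wins one pairwise comparison.
print(update_elo(1500.0, 1500.0, score_a=1.0))  # -> (1516.0, 1484.0)
```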
With the development of Artificial General Intelligence (AGI), deep learning methods are being applied in many other fields involving images, text, and other modalities.
- ProSST: A Pre-trained Protein Sequence and Structure Transformer with Disentangled Attention.
- ProtSSN: Fusion of Protein Sequence and Structural Information using a Denoising Pre-training Network for Protein Engineering (zero-shot).
- Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining.
- Osteoarthritis-Benchmark: Benchmark for Assessing Large Language Models on Knowledge and Decision-Making Capacity in Osteoarthritis Treatment.
Project Leaders: Shaoting Zhang, Xiaosong Wang
Key Contributors:
Shanghai AI Laboratory: Junjun He
Guangzhou Laboratory: Yixue Li
Zhejiang Laboratory: Wentao Zhu
Shanghai Jiao Tong University: Dequan Wang, Xiaofan Zhang, Liang Hong
Fudan University: Yi Guo
University of Electronic Science and Technology of China: Guotai Wang
East China University Of Science And Technology: Tong Ruan
Beijing University of Posts and Telecommunications: Qicheng Lao
Chinese University of Hong Kong: Qi Dou
University of British Columbia: Xiaoxiao Li
University College London: Yukun Zhou
Xi'an Jiaotong University: Zhongyu Li
Rutgers University: Dimitris Metaxas
West China Hospital: Kang Li
Xinhua Hospital: Xin Sun
Ruijin Hospital: Lifeng Zhu
The First Affiliated Hospital of Zhengzhou University: Jie Zhao
Please forward queries to [email protected]