KG-LLaVA: Reproducibility for AAAI Paper

Overview

This repository provides the official implementation of "LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies" (AAAI 2025). KG-LLaVA integrates Knowledge Graph-based Retrieval-Augmented Generation (KG-RAG) with Vision-Language Models (VLMs) to generate Natural Language Explanations (NLEs) for medical imaging.

Installation

To set up the environment, run:

git clone https://github.com/ailab-kyunghee/KG-LLaVA.git
cd KG-LLaVA
pip install -r requirements.txt

To set up the MedCLIP environment separately:

pip install git+https://github.com/RyanWangZf/MedCLIP.git
pip install faiss-gpu

Dataset

MIMIC-NLE

To obtain the MIMIC-NLE dataset, follow the instructions at:

MIMIC-NLE Repository
Download MIMIC-CXR reports: PhysioNet

We use the training set of the official MIMIC-NLE split.

RadGraph Triplet Extraction

To generate Knowledge Graph (KG) triplets from the medical reports, we use RadGraph. Follow these steps:

Download and set up RadGraph from: Stanford-AIMI RadGraph
Run the inference script to extract triplets:

python dataset_preparation.py

We filter triplets to retain only those with suggestive_of relationships.
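
As a rough illustration of this filtering step, the sketch below keeps only suggestive_of triplets and joins them into a prompt-style string. It assumes each triplet is a (subject, relation, object) tuple; the actual structure produced by RadGraph and by the preparation script may differ, and the function names filter_suggestive_of and format_triplets are hypothetical.

```python
# Hypothetical sketch: assumes triplets are (subject, relation, object) tuples.
# The real RadGraph output format may differ.

def filter_suggestive_of(triplets):
    """Keep only triplets whose relation is 'suggestive_of'."""
    return [t for t in triplets if t[1] == "suggestive_of"]


def format_triplets(triplets):
    """Join triplets into a 'subject relation object;' string, one per triplet."""
    return " ".join(f"{s} {r} {o};" for s, r, o in triplets)


if __name__ == "__main__":
    example = [
        ("opacities", "suggestive_of", "effusions"),
        ("opacity", "located_at", "retrocardiac"),
        ("effusion", "modify", "small"),
    ]
    kept = filter_suggestive_of(example)
    print(format_triplets(kept))
```

The located_at and modify triplets are dropped, so only the suggestive_of triplet reaches the prompt.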

Datastore Retrieval

A datastore is built using MedCLIP and FAISS to facilitate knowledge retrieval. The datastore includes:

kg_nle_index
kg_nle_index_captions.json

To retrieve triplets for test images, use:

python datastore_retrieval.py
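
Conceptually, retrieval embeds a query image, finds its nearest neighbours in the index, and returns the triplets stored for those neighbours. The sketch below shows that lookup with a brute-force cosine-similarity search in plain Python, used here as a stand-in for the FAISS index. The toy embeddings, the captions mapping keyed by string position, and the function name retrieve_triplets are illustrative assumptions; they do not describe the actual contents of kg_nle_index or the logic of datastore_retrieval.py.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_triplets(query, index_vectors, captions, k=2):
    """Return the triplet strings of the k stored vectors most similar to the query.

    Assumption: captions maps the string position of each stored vector
    to its triplet text, as a JSON file with string keys would.
    """
    ranked = sorted(
        range(len(index_vectors)),
        key=lambda i: cosine(query, index_vectors[i]),
        reverse=True,
    )
    return [captions[str(i)] for i in ranked[:k]]


if __name__ == "__main__":
    # Toy 2-D embeddings standing in for MedCLIP image features.
    index_vectors = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
    captions = {
        "0": "opacities suggestive_of effusions",
        "1": "opacity suggestive_of pneumonia",
        "2": "opacities suggestive_of atelectasis",
    }
    print(retrieve_triplets([0.9, 0.1], index_vectors, captions, k=2))
```

A FAISS index performs the same nearest-neighbour search far more efficiently on large datastores; the brute-force version is only meant to make the data flow visible.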

Training

To prepare the dataset for LLaVA training, execute the following:

python dataset_preparation.py

This ensures the dataset is in the required format:

[
    {
        "id": "0",
        "split": "train",
        "image": "p11/p11941242/s50000014/dffc8ab2-ff37704f-2fb29e6d-51e08075-88bca914.jpg",
        "conversations": [
            {
                "from": "human",
                "value": "<image>\nThe image-specific triplets from the knowledge graph are: opacities suggestive_of effusions; TRIPLETS HERE BASED ON KG;. And for the given image, Which signs show that the patient has uncertain Atelectasis, positive Lung Opacity, uncertain Pleural Effusion, uncertain Pneumonia?"
            },
            {
                "from": "gpt",
                "value": "Retrocardiac opacity with silhouetting of the left hemidiaphragm and lateral border of the descending aorta is nonspecific and could reflect any of a combination of atelectasis, focal pneumonia or even a small effusion."
            }
        ]
    }
]
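
To make the format concrete, the sketch below assembles one such record from its parts. The prompt wording is copied from the example above; the helper name build_record and its parameters are hypothetical and are not taken from the repository's preparation code.

```python
import json


def build_record(idx, image_path, triplets, findings, explanation, split="train"):
    """Assemble one LLaVA-style conversation record (hypothetical helper)."""
    question = (
        "<image>\nThe image-specific triplets from the knowledge graph are: "
        f"{triplets}. And for the given image, Which signs show that the patient has "
        f"{', '.join(findings)}?"
    )
    return {
        "id": str(idx),
        "split": split,
        "image": image_path,
        "conversations": [
            {"from": "human", "value": question},
            {"from": "gpt", "value": explanation},
        ],
    }


if __name__ == "__main__":
    record = build_record(
        0,
        "p11/p11941242/s50000014/dffc8ab2-ff37704f-2fb29e6d-51e08075-88bca914.jpg",
        "opacities suggestive_of effusions;",
        ["uncertain Atelectasis", "positive Lung Opacity"],
        "Retrocardiac opacity with silhouetting of the left hemidiaphragm ...",
    )
    # The training file is a JSON list of such records.
    print(json.dumps([record], indent=4))
```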

To train LLaVA with KG triplets, run:

bash KG-LLaVA/models/LLaVA/scripts/v1_5/finetune_task_lora.sh

Evaluation

For model evaluation, use:

bash KG-LLaVA/models/LLaVA/scripts/v1_5/eval/vqav2.sh

Acknowledgements

This repository makes use of the following open-source projects:

LLaVA
MedCLIP
RadGraph
MIMIC-NLE
SmallCap

Citation

This work is to appear at AAAI 2025. If you use it, please cite:

@article{hamza2024llava,
  title={LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies},
  author={Hamza, Ameer and Ahn, Yong Hyun and Lee, Sungyoung and Kim, Seong Tae and others},
  journal={arXiv preprint arXiv:2410.04749},
  year={2024}
}
