Capstone_mutimodal: DASC7600 Capstone project

This is the PyTorch implementation of the paper:

RadFusion: Benchmarking Performance and Fairness for Multimodal Pulmonary Embolism Detection from CT and EHR. The paper is available on arXiv.

Outline

Automatic pulmonary embolism (PE) detection from CT data and electronic health records (EHR).

data: the CT images and EHR data should be put under this directory.
dataset: custom dataset/dataloader classes (inherited from PyTorch's Dataset/DataLoader).
models: PENet (a CNN model) and the fusion model. (A vision transformer model will be uploaded later.)

Usage

Environment Setup

The required packages are listed below; the latest version of each is used.

PyTorch
NumPy
pandas
scikit-learn (sklearn)
SciPy
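A quick way to confirm the environment is ready is to check that each package can be found before running anything else. The following is a minimal sketch, not part of the repository; the helper name find_missing is hypothetical, and the module names are the usual import names for the packages listed above.

```python
import importlib.util

# Import names for the packages listed above; scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["torch", "numpy", "pandas", "sklearn", "scipy"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

The check only looks the modules up and does not import them, so it stays fast even when PyTorch is installed.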

For Windows Users

The paths in train.sh and test.sh must be Windows-style.
The paths in read_pkl.py, ./scripts/create_pe_hdf5_update.py and generate_ehr.py must be Windows-style.
In datasets/ct_pe_dataset_3d.py, read the comment inside the _load_volume() function.
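If it helps to see what a Windows-style path looks like, the standard library can render a forward-slash path with backslashes. This is only an illustrative sketch: the helper to_windows_style is hypothetical, and which paths need changing is determined by the scripts themselves.

```python
from pathlib import PureWindowsPath


def to_windows_style(path: str) -> str:
    """Render a forward-slash path with Windows backslash separators."""
    # PureWindowsPath accepts "/" as a separator and prints "\" instead.
    return str(PureWindowsPath(path))


if __name__ == "__main__":
    for p in ["data/processed/data.hdf5", "data/ckpt"]:
        print(p, "->", to_windows_style(p))
```

PureWindowsPath works on any operating system because it only manipulates the string and never touches the file system.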

Data Preparation

CT data and EHR records can be downloaded here

If you choose to use only part of the CT data,

put the .npy files into the directory data/raw
put the .csv files into data/
Generate the .pkl file for the selected data: edit the list part_of_study in read_pkl.py so that it contains the 'idx' values of the data you chose, then run python read_pkl.py. A file named series_list.pkl will appear in data/processed.
Generate the HDF5 file for the selected data: run python ./scripts/create_pe_hdf5_update.py to create data.hdf5 under data/processed.
Generate the combined EHR record for the selected data: edit the list part_of_study in generate_ehr.py so that it contains the 'idx' values of the data you chose, then run python generate_ehr.py. A file named part_of_ehr.csv will appear in data/processed.
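The idea behind part_of_study is to keep only the rows whose 'idx' is in the list. The sketch below illustrates that filtering on a made-up table using the standard csv module. It is not the code in generate_ehr.py, which may work differently; the function select_rows and the column feature_a are hypothetical.

```python
import csv
import io


def select_rows(csv_text, part_of_study, key="idx"):
    """Keep only the rows whose `key` column is listed in part_of_study."""
    wanted = {str(i) for i in part_of_study}
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[key] in wanted]


# Made-up example table; the real EHR files have their own columns.
example = "idx,feature_a\n1,0.5\n2,0.1\n3,0.9\n"
part_of_study = [1, 3]

selected = select_rows(example, part_of_study)
print([row["idx"] for row in selected])
```

Comparing the values as strings avoids a mismatch between integer entries in the list and the text read from the CSV.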

After the steps above, check that data/processed contains these three files:

series_list.pkl
data.hdf5
part_of_ehr.csv
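This check can also be scripted. The following is a minimal sketch that reports which of the three files are missing; the helper name missing_outputs is hypothetical, and it assumes the script is run from the repository root.

```python
from pathlib import Path

# The three files expected in data/processed after data preparation.
EXPECTED_FILES = ["series_list.pkl", "data.hdf5", "part_of_ehr.csv"]


def missing_outputs(processed_dir="data/processed"):
    """Return the expected file names that are not present in processed_dir."""
    root = Path(processed_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_outputs()
    print("Missing:", ", ".join(missing) if missing else "none")
```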

Pretrained model

Our model has two parts: PENet and ElasticNet. Download the best PENet checkpoint and put it into ./data/ckpt.

Train and test

To train the fusion model, run sh train.sh. When training finishes, the trained model is stored in ./train_logs. To test the model, set ckpt_path in test.sh to the trained checkpoint and run sh test.sh.
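To find a value for ckpt_path, it can help to list the most recently written files under ./train_logs. This is a sketch only: the layout and file names inside train_logs are not described here, so the helper newest_files (a hypothetical name) simply lists every file by modification time and leaves the choice of checkpoint to you.

```python
from pathlib import Path


def newest_files(log_dir="train_logs", limit=5):
    """Return up to `limit` files under log_dir, most recently modified first."""
    files = [p for p in Path(log_dir).rglob("*") if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    for path in newest_files():
        print(path)
```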

If you choose to use all the CT data,

Just put the corresponding .hdf5, series_list.pkl and part_of_ehr.csv into data/processed.
