This repository contains the code for the paper "Interpretable medical image Visual Question Answering via multi-modal relationship graph learning". (paper link)
Regarding the dataset, please refer to Medical-CXR-VQA.
The feature extraction code is provided in the `feature extraction` folder. Please refer to the README.md in that folder for more details.
- Feature extraction. Follow the README.md in the `feature extraction` folder to prepare the prerequisite data and extract the features. After running, you will obtain the following files (a quick sanity check is sketched below):
  - `mimic_shape_full.pkl`, `mimic_shapeid_full.pkl`, and `dicom2id.pkl`, stored in the `./data` folder.
  - `cmb_bbox_features_full.hdf5`, stored in the `./data/medical_cxr_vqa` folder.
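  To verify that the extraction finished, a minimal sanity check along these lines can help. This is only a sketch: the internal layout of the pickle and HDF5 files is not documented here, so it just prints generic summaries.

  ```python
  # Sanity-check sketch for the extracted feature files (paths taken from the
  # list above; no assumption is made about the internal structure of each
  # file beyond being loadable, so only coarse summaries are printed).
  import pickle

  import h5py

  with open("./data/mimic_shape_full.pkl", "rb") as f:
      shapes = pickle.load(f)
  print("mimic_shape_full.pkl entries:", len(shapes))

  with h5py.File("./data/medical_cxr_vqa/cmb_bbox_features_full.hdf5", "r") as h5:
      print("cmb_bbox_features_full.hdf5 datasets:", list(h5.keys()))
  ```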
- Data preprocessing. From the repository root (working directory `./`), run `python data_preprocessing.py`. The following files will be generated in the `./data/medical_cxr_vqa` folder (a loading sketch follows this list):
  - `mimic_dataset_train.pkl`
  - `mimic_dataset_val.pkl`
  - `mimic_dataset_test.pkl`
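  To take a quick look at a preprocessed split, a minimal sketch (the object type and entry layout inside the pickle are not documented here, so only a coarse summary is printed):

  ```python
  # Load one preprocessed split and report its type and size.
  import pickle

  with open("./data/medical_cxr_vqa/mimic_dataset_train.pkl", "rb") as f:
      train_set = pickle.load(f)
  print(type(train_set), len(train_set))
  ```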
- Download the model checkpoints from here, and unzip the file into the `./saved_models` folder.
- Run the model:

  `python main.py --relation_type <relation_type> --checkpoint <path_to_checkpoint_file> --mimic_cxr_png <path_to_png_folder> --mode test`

  - `relation_type`: `implicit`, `spatial`, or `semantic`.
  - `path_to_checkpoint_file`: the path to the checkpoint file.
  - `path_to_png_folder`: the path to the mimic-cxr-png folder.
  - `mode`: `train` or `test`. To train from scratch, use `train` and remove the `--checkpoint` argument.

  After running, the predictions file `preds.pkl` will be generated in the `./output/medical_cxr_vqa` folder (see the inspection sketch below).
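  To inspect the generated predictions, a minimal sketch (the content layout of `preds.pkl` is not documented here, so only a coarse summary is printed):

  ```python
  # Inspect the predictions written by main.py in test mode.
  import pickle

  with open("./output/medical_cxr_vqa/preds.pkl", "rb") as f:
      preds = pickle.load(f)
  print(type(preds), len(preds))
  ```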
- Compute the evaluation metrics by running `python tools/get_combine_score.py`. Please change the `mimic_cxr_png` argument accordingly. This command computes the metrics for the `preds_.pkl` files in the `./output/medical_cxr_vqa` folder (a listing sketch follows). The computed results should be very close to those reported in the paper (error ~0.0001). We also provide the `preds.pkl` files used in the paper for reference here.
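  A minimal sketch for listing the prediction files in the output folder before scoring. The `preds*.pkl` glob is an assumption based on the file names mentioned above; it only lists files and makes no claim about what `get_combine_score.py` actually reads.

  ```python
  # List the prediction pickles in the output folder along with their sizes.
  from pathlib import Path

  for p in sorted(Path("./output/medical_cxr_vqa").glob("preds*.pkl")):
      print(p.name, f"{p.stat().st_size / 1e6:.1f} MB")
  ```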