
MMRGL

This repository contains the code for the paper "Interpretable medical image Visual Question Answering via multi-modal relationship graph learning" (paper link).

For the dataset, please refer to Medical-CXR-VQA.

Feature extraction

The feature extraction code is provided in the feature extraction folder. Please refer to the README.md in the folder for more details.

Data Preparation

  1. Feature extraction. Follow the README.md in the feature extraction folder to prepare the prerequisite data and extract the features.

After running, you will obtain the following files:

  • mimic_shape_full.pkl
  • mimic_shapeid_full.pkl
  • dicom2id.pkl

The files above are stored in the ./data folder. The file

  • cmb_bbox_features_full.hdf5

is stored in the ./data/medical_cxr_vqa folder. A quick sanity check of these outputs is sketched below.
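The sketch only relies on the file names and locations listed above; the internal structure of each file is an assumption, so the keys and shapes printed here may need adjusting.

```python
# Minimal sanity check for the extracted files.
# Assumption: each .pkl unpickles into an object with a length; the HDF5 layout is not documented here.
import pickle
import h5py

for name in ["mimic_shape_full.pkl", "mimic_shapeid_full.pkl", "dicom2id.pkl"]:
    with open(f"./data/{name}", "rb") as f:
        obj = pickle.load(f)
    size = len(obj) if hasattr(obj, "__len__") else "n/a"
    print(name, type(obj).__name__, size)

with h5py.File("./data/medical_cxr_vqa/cmb_bbox_features_full.hdf5", "r") as h5:
    # Print the top-level keys and, where available, the dataset shapes.
    for key in h5.keys():
        print(key, getattr(h5[key], "shape", None))
```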

  2. Data preprocessing.

working directory: ./

python data_preprocessing.py

The files below will be generated in the ./data/medical_cxr_vqa folder (a quick inspection sketch follows the list).

  • mimic_dataset_train.pkl
  • mimic_dataset_val.pkl
  • mimic_dataset_test.pkl
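The inspection sketch below assumes each split can be unpickled into an object with a length; the exact record format is not documented here and is an assumption.

```python
# Inspect the preprocessed dataset splits (record format is assumed, not documented here).
import pickle

for split in ["train", "val", "test"]:
    path = f"./data/medical_cxr_vqa/mimic_dataset_{split}.pkl"
    with open(path, "rb") as f:
        data = pickle.load(f)
    size = len(data) if hasattr(data, "__len__") else "n/a"
    print(f"{split}: {type(data).__name__}, {size} samples")
```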

Model

  1. Download the model checkpoints from here. Unzip the file into the ./saved_models folder.

  2. Run the model.

python main.py --relation_type <relation_type> --checkpoint <path_to_checkpoint_file> --mimic_cxr_png <path_to_png_folder> --mode test
  • relation_type: implicit, spatial, or semantic
  • path_to_checkpoint_file: the path to the checkpoint file
  • path_to_png_folder: the path to the mimic-cxr-png folder
  • mode: 'train' or 'test'. To train from scratch, use 'train' and remove the --checkpoint argument (example commands are shown after this list).
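As an illustration only (the checkpoint file name and PNG folder below are placeholders, not the actual names shipped with the download), evaluating a spatial-relation model and training one from scratch would look like:

python main.py --relation_type spatial --checkpoint ./saved_models/<checkpoint_file> --mimic_cxr_png <path_to_png_folder> --mode test

python main.py --relation_type spatial --mimic_cxr_png <path_to_png_folder> --mode train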

After running, the prediction file preds.pkl will be generated in the ./output/medical_cxr_vqa folder.
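To take a quick look at the predictions you can unpickle the file; note that the layout of preds.pkl (dict vs. list, field names) is an assumption here and may differ between relation types.

```python
# Load and summarize the generated predictions (layout of preds.pkl is assumed).
import pickle

with open("./output/medical_cxr_vqa/preds.pkl", "rb") as f:
    preds = pickle.load(f)

print(type(preds).__name__)
if hasattr(preds, "__len__"):
    print("number of prediction entries:", len(preds))
```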

  3. Compute the evaluation metrics.
python tools/get_combine_score.py
  • Please change the mimic_cxr_png argument accordingly.

This command computes the metrics for the preds_.pkl files in the ./output/medical_cxr_vqa folder. The computed results should be very close to those reported in the paper (error ~0.0001). We also provide the preds.pkl files used in the paper for reference here.