Feature extraction

This folder contains the code that uses Faster R-CNN to extract the features required by our method.

Prerequisite data

Prior to extraction, the following files need to be prepared:

MIMIC-CXR-JPG images converted to 1024x1024 PNG and saved to the mimic-cxr-png folder. To do the conversion, run mimic_jpg2png() in converter.py:

python converter.py -p <input_path_to_mimic_cxr_jpg> -o <output_path_to_mimic_cxr_png>

After running this, you will obtain three files:

mimic_shape_full.pkl: contains the shapes of the images in the dataset.
mimic_shapeid_full.pkl: contains, for each image, the index of its shape in mimic_shape_full.pkl.
dicom2id.pkl: contains the mapping between DICOM ID and feature index.
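
To sanity-check the three generated files, a small helper like the one below may be useful. It is only a sketch: the function name describe_pickle is hypothetical, the files are assumed to sit in the current directory, and nothing is assumed about their internal layout beyond being loadable with pickle (if they hold NumPy objects, NumPy must be installed to load them). Only unpickle files you generated yourself or otherwise trust.

```python
import pickle
from pathlib import Path


def describe_pickle(path):
    """Load a pickle file and return (type name, length or None).

    Hypothetical helper: it makes no assumption about what the
    converter stores, it only reports the top-level container.
    """
    with open(path, "rb") as f:
        obj = pickle.load(f)
    # Not every object has a length, so fall back to None.
    size = len(obj) if hasattr(obj, "__len__") else None
    return type(obj).__name__, size


if __name__ == "__main__":
    # File names come from the list above; their location is an assumption.
    for name in ("mimic_shape_full.pkl", "mimic_shapeid_full.pkl", "dicom2id.pkl"):
        path = Path(name)
        if path.exists():
            print(name, *describe_pickle(path))
        else:
            print(name, "not found")
```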

Faster R-CNN checkpoints. Make sure these are located in the checkpoints folder.
- checkpoints/model_final_for_anatomy_gold.pth (Download link. Used for anatomical structure detection; can also be obtained by running train_anatomy.py)
- checkpoints/model_final_for_vindr.pth (Download link. Used for disease detection; can also be obtained by running train-vindr-online.py)
Dictionary files. Make sure these are in the dictionary folder.
- dictionary/category_ana.pkl (An anatomical structure category set)
- dictionary/GT_counting_adj.pkl (A co-occurrence matrix of findings in MIMIC-CXR-JPG)
- dictionary/mimic_ans2label_full.pkl (A dictionary that maps the answer to the label in MIMIC-CXR-JPG)
(Optional) The GT_counting_adj.pkl file listed above can be generated by running:

python dictionary/preparation.py -p <path_to_mimic_cxr_jpg>
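
Before starting extraction, it may help to confirm that the checkpoint and dictionary files listed above are in place. The following is a minimal sketch, not part of the repository: the function name missing_prerequisites and the root argument are hypothetical, and the relative paths are taken from the list above on the assumption that they are resolved against the directory you run from.

```python
from pathlib import Path

# Relative paths copied from the prerequisite list above.
REQUIRED_FILES = [
    "checkpoints/model_final_for_anatomy_gold.pth",
    "checkpoints/model_final_for_vindr.pth",
    "dictionary/category_ana.pkl",
    "dictionary/GT_counting_adj.pkl",
    "dictionary/mimic_ans2label_full.pkl",
]


def missing_prerequisites(root="."):
    """Return the required files that do not exist under root."""
    root = Path(root)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_prerequisites()
    if missing:
        print("Missing prerequisite files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All prerequisite files found.")
```

An empty result means every listed file was found; it does not check that the files are valid checkpoints or pickles.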

Extraction

Working directory: ./feature_extraction

1, Anatomical structure feature extraction

python ana_bbox_generator.py

2, Disease feature extraction

Disease features are extracted with the trained disease detection model from the anatomical structure bounding boxes produced in the previous step.

python bbox_gen_by_coords.py

3, Feature combination

python combine_datasets.py

cmb_bbox_features_full.hdf5 will be generated in the data/medical_cxr_vqa folder.
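
The three steps above must run in order, since each one consumes the output of the previous one. A small wrapper can enforce that and stop at the first failure. This is a sketch under assumptions: the function name run_pipeline and its dry_run flag are hypothetical, the scripts are assumed to take no command-line arguments (as in the commands above), and the working directory ./feature_extraction is assumed to be relative to where the wrapper is launched.

```python
import subprocess
import sys

# Script names and their order are taken from the steps above.
STEPS = [
    "ana_bbox_generator.py",   # 1, anatomical structure features
    "bbox_gen_by_coords.py",   # 2, disease features on those boxes
    "combine_datasets.py",     # 3, combine both feature sets
]


def run_pipeline(workdir="./feature_extraction", dry_run=False):
    """Run the extraction scripts in order and return the commands used.

    With dry_run=True nothing is executed. Otherwise check=True makes
    subprocess.run raise CalledProcessError on a non-zero exit code,
    so a failing step prevents the later steps from running.
    """
    commands = [[sys.executable, script] for script in STEPS]
    for cmd in commands:
        if not dry_run:
            subprocess.run(cmd, cwd=workdir, check=True)
    return commands


if __name__ == "__main__":
    # Print the commands without running them; pass dry_run=False to execute.
    for cmd in run_pipeline(dry_run=True):
        print(" ".join(cmd))
```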
