GitHub - CBIIT/NCI-DOE-Collab-Pilot3-Multitask-Convolutional-Neural-Network: MT-CNN is a CNN for Natural Language Processing and Information Extraction from free-form texts. BSEC group designed the model for information extraction from cancer pathology reports.

Multitask Convolutional Neural Network (MT-CNN)

Description

MT-CNN is a CNN for natural language processing (NLP) and information extraction from free-form texts. This model extracts information from cancer pathology reports.
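To make the "multitask CNN" idea concrete, here is a minimal sketch of a forward pass in plain Python: one shared convolutional feature extractor feeds a separate softmax classifier per task. It is not the repository's implementation (that lives in keras_mt_shared_cnn.py). The function names, the task names task_a and task_b, the filter widths, and all sizes are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random

def conv1d_maxpool(embedded, filters):
    """Slide each filter over the token sequence; keep its largest ReLU response."""
    features = []
    for filt in filters:                      # filt: `width` weight vectors of size EMB_DIM
        width = len(filt)
        best = 0.0                            # ReLU floor: negative responses are clipped
        for start in range(len(embedded) - width + 1):
            window = embedded[start:start + width]
            response = sum(w * x
                           for w_vec, x_vec in zip(filt, window)
                           for w, x in zip(w_vec, x_vec))
            best = max(best, response)        # global max pooling over positions
        features.append(best)
    return features

def softmax(logits):
    peak = max(logits)                        # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mt_cnn_forward(token_ids, embedding, filters, task_heads):
    """Shared embedding + convolution, then one softmax head per task."""
    embedded = [embedding[t] for t in token_ids]
    shared = conv1d_maxpool(embedded, filters)
    return {
        task: softmax([sum(w * f for w, f in zip(row, shared)) for row in weights])
        for task, weights in task_heads.items()
    }

# Toy setup with random, untrained weights (all sizes are hypothetical).
rng = random.Random(0)

def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

VOCAB_SIZE, EMB_DIM = 20, 4
embedding = rand_matrix(VOCAB_SIZE, EMB_DIM)
# Two filters for each of three assumed window widths (3, 4, and 5 tokens).
filters = [rand_matrix(width, EMB_DIM) for width in (3, 4, 5) for _ in range(2)]
# Hypothetical tasks: task_a has 3 classes, task_b has 5 classes.
task_heads = {
    "task_a": rand_matrix(3, len(filters)),
    "task_b": rand_matrix(5, len(filters)),
}

token_ids = [rng.randrange(VOCAB_SIZE) for _ in range(12)]
probs = mt_cnn_forward(token_ids, embedding, filters, task_heads)
for task, distribution in probs.items():
    print(task, [round(p, 3) for p in distribution])
```

Because the convolutional features are computed once and reused by every head, adding a task in this sketch costs only one extra weight matrix.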

User Community

Data scientists interested in classifying free-form texts (such as pathology reports, clinical trials, abstracts, and so on).

Usability

Data scientists can train the provided untrained model on their own data, or use the trained model to classify the provided test samples. The provided scripts use pathology reports that have been downloaded from the Genomics Data Commons (GDC), converted to text format, cleaned, and preprocessed. Here is an example report.
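As a rough illustration of the kind of preprocessing a text CNN typically needs, the sketch below turns cleaned report text into fixed-length sequences of integer token IDs. It is an assumption about the general approach, not the repository's actual pipeline. The helper names (tokenize, build_vocab, encode), the special tokens, and the sample sentences are all made up for this example.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_vocab(reports, min_count=1):
    """Map each token seen at least `min_count` times to an integer ID."""
    counts = {}
    for report in reports:
        for token in tokenize(report):
            counts[token] = counts.get(token, 0) + 1
    vocab = {"<pad>": 0, "<unk>": 1}          # reserved IDs (hypothetical convention)
    for token in sorted(counts):              # sorted for a deterministic mapping
        if counts[token] >= min_count:
            vocab[token] = len(vocab)
    return vocab

def encode(text, vocab, max_len):
    """Convert text to exactly `max_len` IDs: truncate long input, pad short input."""
    ids = [vocab.get(token, vocab["<unk>"]) for token in tokenize(text)][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))

# Made-up sentences standing in for cleaned pathology reports.
train_reports = [
    "Invasive ductal carcinoma, left breast.",
    "Adenocarcinoma of the colon.",
]
vocab = build_vocab(train_reports)

# "right" was never seen during vocabulary building, so it maps to <unk>.
encoded = encode("Invasive ductal carcinoma, right breast.", vocab, max_len=8)
print(encoded)
```

In practice max_len would be set large enough to cover the long reports the model targets. The value 8 here only keeps the example readable.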

Uniqueness

Classification of unstructured text is a classical problem in natural language processing. The community has developed state-of-the-art models such as BERT, Bio-BERT, and Transformer. This model has the advantage of handling relatively long reports (that is, over 400 words), and it remains robust in accuracy and speed with a relatively small number of unstructured pathology reports.

Components

The following components are in the Model and Data Clearinghouse (MoDaC):

The ML Ready Pathology Reports dataset contains the original data used for training, validation, and testing.
The MultiTask Convolutional Neural Network (MT-CNN) dataset contains the trained model weights and topology to be used in inference.

Technical Details

Refer to README-technical.md.

Author

Biomedical Sciences, Engineering, and Computing (BSEC) Group; Computer Sciences and Engineering Division; Oak Ridge National Laboratory

Folders and files

The repository history contains 62 commits.

data
data_utils
LICENSE
README-technical.md
README.md
environment.yml
keras_mt_shared_cnn.py
mt_cnn_exp.py
mt_cnn_infer.py
predictions.py
