A Bidirectional Gated Recurrent Neural Network and Capsule Network Based Approach for Identification of Relevant Literature Regarding Protein Interactions and Mutations for Precision Medicine
According to the Precision Medicine Initiative, precision medicine is an emerging approach to disease treatment and prevention that takes into account each person's individual variability in genes, environment, and lifestyle. In this paper, we describe our approach to the Document Triage Task of the BioCreative VI Precision Medicine Track 4, which focused on using text mining to identify relevant PubMed citations describing genetic mutations that affect protein-protein interactions. We propose a model for the task based on a bidirectional Gated Recurrent Unit (GRU) network and a Capsule Network. To convert the PubMed corpus into vector representations that can be fed into the neural network, we use a pre-trained word2vec model to obtain word representations. When evaluated with the script provided by the task organizers, our system achieves a precision of 0.6295, a recall of 0.7386, and an F-score of 0.6797.
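The three reported scores can be checked against one another. The sketch below assumes that the F-score is the balanced F1 measure, i.e. the harmonic mean of precision and recall; the abstract does not state which variant the evaluation script computes, and the function name `f_score` is purely illustrative.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall.

    Assumption: the reported F-score is F1 (equal weight on
    precision and recall); the abstract does not say so explicitly.
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are 0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract.
reported_precision = 0.6295
reported_recall = 0.7386
reported_f = 0.6797

computed_f = f_score(reported_precision, reported_recall)
print(f"computed F1 = {computed_f:.4f}, reported F-score = {reported_f:.4f}")
```

Under this assumption, the computed value agrees with the reported F-score to four decimal places, which suggests the reported figure is indeed F1.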