Outline
Radhu Ladani edited this page Jun 30, 2021 · 18 revisions
- I am Radhu Ladani, a postgraduate student at Bharati Vidyapeeth University (Rajiv Gandhi Institute of Information Technology and Biotechnology), Pune.
- Working on CEVOpen was a remarkable experience for me.
- The CEVOpen team is committed to creating a single entry point for searching open scientific literature, and phytochemistry is a key element of this project.
- My project focuses on phytochemical ontologies for analyzing the literature on essential oils. This includes developing dictionaries and a mini corpus, and testing search tools with the dictionaries against the mini corpus.
  - We created a corpus of 500 papers on medicinal activity and essential oils using the getpapers toolkit, a tool for downloading open-access scientific literature.
  - We created dictionaries for Medicinal activity and Essential oil plants. These dictionaries are ontologies that identify words and phrases within the documents. We tapped into Wikidata by retrieving entries with a SPARQL query and then converting the results into dictionaries with the ami dict tool. Each dictionary entry contains a title, multiple important terms, a description, a Wikidata ID, a Wikipedia URL, and synonyms. A significant feature of these dictionaries is their multilingualism. This is how dictionaries derive multidisciplinary knowledge.
  - To gain insight from the open scientific literature into the association between essential oil plants and compounds and their medicinal activity, we used our own search engine, ami, which searches for and analyses the dictionary terms in the project repository and reports each term's frequency along with a histogram.
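The dictionary search described above can be illustrated with a small Python sketch. This is not how ami is implemented; it only shows the general idea of counting dictionary terms (and their synonyms) in a set of texts and printing a text histogram. The dictionary entries, the placeholder Wikidata IDs, the sample sentences, and the function names `build_patterns` and `count_terms` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical mini dictionary. The IDs are placeholders, not real Wikidata IDs.
DICTIONARY = [
    {"term": "thymol", "wikidata_id": "Q-PLACEHOLDER-1",
     "synonyms": ["2-isopropyl-5-methylphenol"]},
    {"term": "carvacrol", "wikidata_id": "Q-PLACEHOLDER-2",
     "synonyms": ["cymophenol"]},
]

# Hypothetical sample texts standing in for papers in the mini corpus.
SAMPLE_TEXTS = [
    "Thymol and carvacrol were the major constituents; thymol showed antifungal activity.",
    "The oil was rich in cymophenol (carvacrol).",
]


def build_patterns(dictionary):
    """Compile one case-insensitive pattern per entry, covering term and synonyms."""
    patterns = {}
    for entry in dictionary:
        names = [entry["term"], *entry.get("synonyms", [])]
        # Longest names first so a longer synonym is preferred over a shorter overlap.
        names.sort(key=len, reverse=True)
        alternation = "|".join(re.escape(name) for name in names)
        # Lookarounds keep matches from starting or ending inside a longer word.
        patterns[entry["term"]] = re.compile(
            rf"(?<!\w)(?:{alternation})(?!\w)", re.IGNORECASE
        )
    return patterns


def count_terms(texts, dictionary):
    """Count how often each dictionary entry (term or synonym) occurs in the texts."""
    patterns = build_patterns(dictionary)
    counts = Counter()
    for text in texts:
        for term, pattern in patterns.items():
            counts[term] += len(pattern.findall(text))
    return counts


counts = count_terms(SAMPLE_TEXTS, DICTIONARY)

# Print a simple text histogram, most frequent term first.
for term, n in counts.most_common():
    print(f"{term:<12}{'#' * n} {n}")
```

Synonym hits are credited to the main term, so a paper that mentions only "cymophenol" still counts towards carvacrol. This is one reason dictionaries with synonyms find more than a plain keyword search.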
- Co-occurrence analysis of the ami search results revealed some interesting insights, such as:
  - Antioxidant, antimicrobial, antifungal, and anti-inflammatory properties are common.
  - Carvacrol and thymol are two essential oil compounds that are frequently mentioned, and essential oil plants such as Rosmarinus officinalis (rosemary), Origanum vulgare (oregano), Thymus vulgaris (thyme), and Ocimum basilicum (Tulsi) are commonly present in open scientific literature.
  - Countries like China and India are frequently mentioned in papers about medicinal activity and essential oils.
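Co-occurrence here can be read as two dictionary terms appearing in the same paper. A minimal Python sketch of that idea follows. It is an assumption about the general technique, not a description of ami's own co-occurrence output. The sample sentences and the function name `cooccurrence` are hypothetical, and plain substring matching is used only to keep the example short.

```python
from collections import Counter
from itertools import product


def cooccurrence(docs, terms_a, terms_b):
    """Count, per pair (a, b), the number of documents mentioning both terms."""
    pair_counts = Counter()
    for text in docs:
        lowered = text.lower()
        # Simplification: case-insensitive substring matching, no synonyms.
        found_a = [term for term in terms_a if term.lower() in lowered]
        found_b = [term for term in terms_b if term.lower() in lowered]
        # Every plant found in this document is paired with every activity found.
        pair_counts.update(product(found_a, found_b))
    return pair_counts


# Hypothetical sample documents standing in for papers in the corpus.
docs = [
    "Thymus vulgaris oil showed antimicrobial and antioxidant activity.",
    "Origanum vulgare extracts were antimicrobial.",
    "Antioxidant capacity of Thymus vulgaris was measured.",
]
plants = ["Thymus vulgaris", "Origanum vulgare"]
activities = ["antimicrobial", "antioxidant"]

pairs = cooccurrence(docs, plants, activities)
for (plant, activity), n in pairs.most_common():
    print(f"{plant} + {activity}: {n}")
```

Counting each pair once per document, rather than once per mention, keeps a single long paper from dominating the result.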
- In future, the project will focus on text mining of open scientific literature to create a multilingual semantic atlas of volatile phytochemistry. This will include text categorization, clustering, entity extraction, document summarization, and sentiment analysis using various machine learning techniques.
- We used GitHub (https://github.com/petermr) to store our work because we believe in the OpenNotebook philosophy: all work is done as OpenNotebookScience, with all activities completely transparent and shared in real time.
- I'm extremely grateful to NIPGR for giving me this opportunity. I would like to express my sincere gratitude to Dr. Gitanjali Yadav and Dr. Peter Murray-Rust for their valuable guidance and life-impacting mentoring. This endeavor would not have been possible without the constant and unconditional support of all my co-interns.
- Thank you