Repository for the paper "Scent Mining: Extracting Olfactory Events, Smell Sources and Qualities".
The code and the model used to extract olfactory information from texts can be found here: https://github.com/Odeuropa/wp3-information-extraction-system
@inproceedings{menini-etal-2023-scent,
title = "Scent Mining: Extracting Olfactory Events, Smell Sources and Qualities",
author = "Menini, Stefano and
Paccosi, Teresa and
Tekiro{\u{g}}lu, Serra Sinem and
Tonelli, Sara",
booktitle = "Proceedings of the 7th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature",
month = may,
year = "2023",
address = "Dubrovnik, Croatia",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2023.latechclfl-1.15",
pages = "135--140",
abstract = "Olfaction is a rather understudied sense compared to the other senses. In NLP, however, there have been recent attempts to develop taxonomies and benchmarks specifically designed to capture smell-related information. In this work, we further extend this research line by presenting a supervised system for olfactory information extraction in English. We cast this problem as a token classification task and build a system that identifies smell words, smell sources and qualities. The classifier is then applied to a set of English historical corpora, covering different domains and written in a time period between the 15th and the 20th Century. A qualitative analysis of the extracted data shows that they can be used to infer interesting information about smelly items such as tea and tobacco from a diachronical perspective, supporting historical investigation with corpus-based evidence.",
}
This work has been realised in the context of Odeuropa, a research project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 101004469.