Group project assignment for the Text Analysis and Retrieval course held at the
University of Zagreb, Faculty of Electrical Engineering and Computing.
Amid the global crisis caused by the COVID-19 pandemic, we decided to tackle the COVID-19 Open Research Dataset Challenge (CORD-19) and make our humble contribution by creating a search engine for COVID-related information. We built and evaluated a total of six versions of the search engine.
Each model assigns a vector representation to every paragraph in the corpus and ranks the paragraphs by cosine similarity between their vectors and the vector representation generated for the incoming query. All our code is available in this repo; you can check our system description paper for more details.
You can see our model in action in this notebook and read the final version of our paper at this link.
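
The ranking step described above (embed every paragraph, embed the query, sort by cosine similarity) can be illustrated with a minimal sketch. This is not the code from the repo: the `embed` function is a hypothetical stand-in that uses plain bag-of-words counts instead of any of our six models, and the names `cosine` and `rank_paragraphs` are chosen only for this example.

```python
import math
from collections import Counter


def embed(text):
    """Hypothetical stand-in for a model: bag-of-words term counts."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty vector is treated as similar to nothing
    return dot / (norm_u * norm_v)


def rank_paragraphs(query, paragraphs):
    """Return (score, paragraph) pairs sorted by descending similarity."""
    query_vec = embed(query)
    scored = [(cosine(query_vec, embed(p)), p) for p in paragraphs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy corpus, made up for illustration only.
paragraphs = [
    "incubation period of the virus",
    "economic impact of lockdown measures",
    "virus transmission in hospitals",
]
ranking = rank_paragraphs("virus incubation period", paragraphs)
for score, paragraph in ranking:
    print(f"{score:.3f}  {paragraph}")
```

Swapping `embed` for a real model's encoder would leave the ranking logic unchanged, which is roughly why several model variants can share one retrieval pipeline.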