Search sentences from the abstracts of articles in the arXiv dataset. A sentence matches when the keyword appears in the subject of the sentence. Subjects containing the keyword are highlighted.
Made by Haozhe Si, 01/10/2021
Uses a spaCy model to perform dependency parsing. You don't have to install it unless you are modifying myindex. The model is not 100% accurate and may cause some issues in searching.
pip install spacy
python -m spacy download en_core_web_md
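The subject matching can be sketched roughly as below. This is a minimal illustration, not the code in core.zip: the function names `subject_spans` and `highlight_subjects`, the `<b>` tags, and the choice to treat tokens labelled `nsubj` or `nsubjpass` (plus their subtrees) as "the subject" are all assumptions made for the example.

```python
def subject_spans(text, nlp):
    """Return (start, end) character offsets of each subject phrase in `text`.

    `nlp` is a loaded spaCy pipeline, e.g. spacy.load("en_core_web_md").
    Assumption: a "subject" is any token whose dependency label is
    nsubj or nsubjpass, together with its whole subtree.
    """
    doc = nlp(text)
    spans = []
    for token in doc:
        if token.dep_ in ("nsubj", "nsubjpass"):
            phrase = doc[token.left_edge.i : token.right_edge.i + 1]
            spans.append((phrase.start_char, phrase.end_char))
    return spans


def highlight_subjects(text, spans, keyword, open_tag="<b>", close_tag="</b>"):
    """Wrap every subject span that contains `keyword` in highlight tags.

    The keyword test is a case-insensitive substring match.
    """
    pieces, cursor = [], 0
    for start, end in sorted(spans):
        if start < cursor:
            continue  # span lies inside one that was already highlighted
        if keyword.lower() in text[start:end].lower():
            pieces.append(text[cursor:start])
            pieces.append(open_tag + text[start:end] + close_tag)
            cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


# Example usage (requires spaCy and the en_core_web_md model):
#   import spacy
#   nlp = spacy.load("en_core_web_md")
#   s = "Deep neural networks achieve strong results."
#   print(highlight_subjects(s, subject_spans(s, nlp), "neural"))
```

Keeping the parsing step separate from the highlighting step means the highlighting can be checked with hand-written offsets, without loading the model.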
Uses the Whoosh module to build the search engine.
pip install whoosh
Uses the Flask module to build the web interface. The app is served on localhost at http://127.0.0.1:5000/.
pip install flask
Download core.zip, unpack it, and run
python arxiv_web.py
By default, the corpus size is 10,000. You can change the corpus size in archivesearch.ipynb.