-
py-poppleract Public
Python library and web service, based on the Poppler pdftotext utility and Tesseract OCR, for extracting text from PDF documents
-
py-web-miner Public
Extensible web miner for extracting information from web pages. It is based on the Requests HTTP library, the Beautiful Soup parser, and Selenium WebDriver.
-
knowledge-graph-pruning-2022 Public
Experiments on knowledge graph embedding models for link prediction and deletion
-
gen-text-compr-aggl-clust-sum Public template
A library for topic modeling based on the Generative Text Compression with Agglomerative Clustering Summarization (GTCACS) algorithm