PyThaiNLP Corpus Downloader
Wannaphong Phatthiyaphaibun edited this page Jan 10, 2021 · 1 revision
Code name: pythainlp-data
PyThaiNLP is developing very quickly, and its data store changes with every new release. A new system for updating and saving data, called pythainlp-data, was therefore needed to replace the old one.
Development
- We use a TinyDB database for the local catalog (on the user's machine).
- We use a JSON file for the store. The available corpus names are listed in this file: pythainlp.github.io/pythainlp-corpus/db.json
- We use GitHub releases to store each corpus/model.
- By default, downloaded corpora and models are saved in $HOME/pythainlp-data/ (e.g. /Users/bact/pythainlp-data/wiki_lm_lstm.pth).
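The default save location described above can be sketched in a few lines of Python. This is only an illustration of the path convention, not PyThaiNLP's own code: the helper names default_data_dir and corpus_file_path are hypothetical, and the sketch assumes that $HOME corresponds to what Path.home() returns.

```python
from pathlib import Path

# Directory name taken from the page; everything else here is illustrative.
DATA_DIR_NAME = "pythainlp-data"


def default_data_dir() -> Path:
    """Return the default data directory, assumed to be $HOME/pythainlp-data/."""
    return Path.home() / DATA_DIR_NAME


def corpus_file_path(filename: str) -> Path:
    """Return where a downloaded corpus or model file would be saved (hypothetical helper)."""
    return default_data_dir() / filename


if __name__ == "__main__":
    # Prints a path ending in pythainlp-data/wiki_lm_lstm.pth under the home directory.
    print(corpus_file_path("wiki_lm_lstm.pth"))
```

For a user whose home directory is /Users/bact, this would correspond to the example path given in the list above.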
You can browse the available corpora at pythainlp.github.io/pythainlp-corpus/
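The corpus names in db.json can also be read programmatically. The following is a minimal sketch using only the standard library. It rests on two assumptions that this page does not confirm: that the file is served over https, and that it is a JSON object whose top-level keys are the corpus names. The function names are hypothetical.

```python
import json
from typing import List
from urllib.request import urlopen

# Assumption: the db.json file named on this page is reachable over https.
CATALOG_URL = "https://pythainlp.github.io/pythainlp-corpus/db.json"


def parse_catalog_names(raw: str) -> List[str]:
    """Return the sorted top-level keys of the catalog JSON text.

    Assumption: the catalog is a JSON object keyed by corpus name.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level of the catalog")
    return sorted(data)


def fetch_catalog_names(url: str = CATALOG_URL, timeout: float = 10.0) -> List[str]:
    """Download the catalog and return its corpus names (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_catalog_names(response.read().decode("utf-8"))
```

If the real file nests its entries under another key, only parse_catalog_names would need to change.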