Commit

add wikipedia
menshikh-iv committed Nov 10, 2017
1 parent ad161d5 commit 0a5437a
Showing 1 changed file with 10 additions and 0 deletions.
10 changes: 10 additions & 0 deletions list.json
@@ -1,5 +1,15 @@
{
"corpora": {
"wiki-en": {
"description": "Extracted Wikipedia dump from October 2017. Produced by `python -m gensim.scripts.segment_wiki -f enwiki-20171001-pages-articles.xml.bz2 -o wiki-en.gz`",
"checksum-0": "a7d7d7fd41ea7e2d7fa32ec1bb640d71",
"checksum-1": "b2683e3356ffbca3b6c2dca6e9801f9f",
"checksum-2": "c5cde2a9ae77b3c4ebce804f6df542c2",
"checksum-3": "00b71144ed5e3aeeb885de84f7452b81",
"file_name": "wiki-en.gz",
"source": "https://dumps.wikimedia.org/enwiki/20171001/",
"parts": 4
},
"text8": {
"description": "Cleaned small sample from wikipedia",
"checksum": "68799af40b6bda07dfa47a32612e5364",
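The new "wiki-en" entry is split into several parts ("parts": 4) with one checksum per part ("checksum-0" through "checksum-3"), while "text8" keeps a single "checksum" key. The following is a minimal sketch of how a consumer of list.json might check downloaded parts against such an entry. It assumes the checksums are MD5 hex digests (their 32-character length is consistent with MD5, but the diff does not say so), and the helper names and the way part files are passed in are hypothetical, not taken from the repository.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_checksums(entry):
    """List the expected checksums of a list.json entry, one per part.

    Multi-part entries carry "parts" plus "checksum-<i>" keys;
    single-file entries carry a plain "checksum" key.
    """
    parts = entry.get("parts")
    if parts is None:
        return [entry["checksum"]]
    return [entry["checksum-{}".format(i)] for i in range(parts)]


def verify_parts(entry, part_paths):
    """Compare each local part file with its expected checksum.

    part_paths must be ordered by part index (hypothetical convention).
    Returns one boolean per part.
    """
    expected = expected_checksums(entry)
    if len(expected) != len(part_paths):
        raise ValueError(
            "expected {} part(s), got {}".format(len(expected), len(part_paths))
        )
    return [md5_of_file(p) == e for p, e in zip(part_paths, expected)]
```

With the entry loaded via json.load(...)["corpora"]["wiki-en"], a call such as verify_parts(entry, paths) would return four booleans, one per downloaded part. Returning a list rather than a single flag makes it possible to re-download only the parts that failed.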
