
word2vec-google-news-300

@menshikh-iv menshikh-iv released this 09 Nov 08:50

Pre-trained vectors trained on part of the Google News dataset (about 100 billion words). The model contains 300-dimensional vectors for 3 million words and phrases. The phrases were obtained using a simple data-driven approach described in "Distributed Representations of Words and Phrases and their Compositionality".
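The data-driven phrase approach in that paper scores each adjacent word pair as (count(a b) - delta) / (count(a) * count(b)) and merges pairs whose score exceeds a threshold, which is why multi-word entries such as `crown_prince` appear in the vocabulary. Below is a minimal sketch of that scoring on a toy corpus. The function names, the toy sentence, and the `delta` and `threshold` values are illustrative assumptions, not the settings used to build this model.

```python
from collections import Counter

def score_bigrams(tokens, delta=1.0):
    """Score adjacent pairs as (count(ab) - delta) / (count(a) * count(b))."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return {
        pair: (n - delta) / (unigrams[pair[0]] * unigrams[pair[1]])
        for pair, n in bigrams.items()
    }

def merge_phrases(tokens, threshold, delta=1.0):
    """Greedily join pairs scoring above the threshold with an underscore."""
    scores = score_bigrams(tokens, delta)
    merged, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if len(pair) == 2 and scores[pair] > threshold:
            merged.append("_".join(pair))
            i += 2  # both words were consumed by the phrase
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Toy corpus (hypothetical): "new york" repeats, every other pair occurs once.
toy_tokens = "new york is big and new york was old and the city is new".split()
print(merge_phrases(toy_tokens, threshold=0.1))
```

The discount `delta` keeps pairs that occur only once from being merged: their score is zero no matter how rare the individual words are.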

| Feature | Description |
| --- | --- |
| File size | 1.6 GB |
| Number of vectors | 3000000 |
| Dimension | 300 |
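As a rough guide to memory use once the model is loaded, the figures above can be turned into an estimate of the raw vector matrix size. This sketch assumes each component is stored as a 32-bit float, which is not stated here. Under that assumption the matrix is larger than the listed file size, so the 1.6 GB figure presumably refers to a compressed download.

```python
num_vectors = 3_000_000
dimension = 300
bytes_per_float = 4  # assumption: float32 components

# Size of the dense vector matrix alone, ignoring the vocabulary strings.
raw_bytes = num_vectors * dimension * bytes_per_float
print(f"{raw_bytes / 1e9:.1f} GB")  # 3.6 GB
```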

Read more:

Example

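import gensim.downloader as api

# The first call downloads the model (about 1.6GB) and caches it locally.
model = api.load("word2vec-google-news-300")

# Analogy query: king - man + woman
model.most_similar(positive=["king", "woman"], negative=["man"])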

"""
Output:

[(u'queen', 0.7118192911148071),
 (u'monarch', 0.6189674139022827),
 (u'princess', 0.5902431011199951),
 (u'crown_prince', 0.5499460697174072),
 (u'prince', 0.5377321243286133),
 (u'kings', 0.5236844420433044),
 (u'Queen_Consort', 0.5235945582389832),
 (u'queens', 0.518113374710083),
 (u'sultan', 0.5098593235015869),
 (u'monarchy', 0.5087411999702454)]

"""
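To make the query above less of a black box, here is a rough pure-Python sketch of the computation behind this kind of analogy lookup: the positive vectors are added and the negative ones subtracted after unit-normalization, and the remaining words are ranked by cosine similarity to the result, with the query words themselves excluded. The `analogy` helper and the three-dimensional toy vectors are hypothetical and only illustrate the idea; they are not taken from the model or from gensim's source.

```python
import math

def unit(v):
    """Scale a vector to length 1."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def analogy(vectors, positive, negative, topn=3):
    """Rank words by cosine similarity to sum(positive) - sum(negative)."""
    dim = len(next(iter(vectors.values())))
    query = [0.0] * dim
    signed = [(w, 1.0) for w in positive] + [(w, -1.0) for w in negative]
    for word, sign in signed:
        for i, x in enumerate(unit(vectors[word])):
            query[i] += sign * x
    query = unit(query)
    exclude = set(positive) | set(negative)  # never return the query words
    scored = [
        (word, sum(q * x for q, x in zip(query, unit(vec))))
        for word, vec in vectors.items()
        if word not in exclude
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:topn]

# Hypothetical toy embedding with axes (royalty, male, female).
toy_vectors = {
    "king": [1.0, 1.0, 0.0],
    "queen": [1.0, 0.0, 1.0],
    "man": [0.0, 1.0, 0.0],
    "woman": [0.0, 0.0, 1.0],
    "apple": [0.1, 0.1, 0.1],
}
result = analogy(toy_vectors, positive=["king", "woman"], negative=["man"])
print(result[0][0])  # queen
```

Excluding the query words matters in practice: without it, the nearest neighbour of the combined vector is often one of the inputs, such as "king".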