
Word2Vec implementation using the Game of Thrones data-set.


eyalzk/throne2vec


Throne2Vec

Training a word2vec model on a data-set containing the entire Game of Thrones book collection

This notebook is based on assignment 5 of the Udacity Deep-Learning course.

Besides the data-set, the following is new compared to the original assignment:

  • Text Pre-Processing
  • Finding word analogies using the learned embedding
  • More detailed comments
  • Optimizations
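
To illustrate the word-analogy item, here is a minimal sketch of how an analogy query ("a is to b as c is to ?") can be answered with a learned embedding matrix. It is not taken from the notebook: the function name `find_analogy`, the toy vocabulary, and the hand-made 3-D vectors are all assumptions for illustration, and plain NumPy stands in for whatever framework the notebook uses. The idea is the usual vector-offset approach: compute `b - a + c` and return the nearest words by cosine similarity.

```python
import numpy as np


def find_analogy(embeddings, vocab, a, b, c, top_k=1):
    """Return the top_k words d that best complete 'a : b :: c : d'.

    embeddings: array of shape (vocab_size, dim), one row per word.
    vocab:      list of words, aligned with the rows of `embeddings`.
    (Hypothetical helper, not part of the notebook.)
    """
    word_to_id = {word: i for i, word in enumerate(vocab)}

    # Normalize rows so that dot products are cosine similarities.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.maximum(norms, 1e-12)

    # Vector-offset query: b - a + c, normalized to unit length.
    query = unit[word_to_id[b]] - unit[word_to_id[a]] + unit[word_to_id[c]]
    query /= max(np.linalg.norm(query), 1e-12)

    scores = unit @ query

    # Exclude the query words themselves from the candidates.
    for word in (a, b, c):
        scores[word_to_id[word]] = -np.inf

    best = np.argsort(-scores)[:top_k]
    return [vocab[i] for i in best]


# Toy, hand-made vectors (axes: male, female, royal). Real embeddings
# would come from the trained model instead.
toy_vocab = ["man", "woman", "king", "queen", "throne", "sword"]
toy_embeddings = np.array([
    [1.0, 0.0, 0.0],   # man
    [0.0, 1.0, 0.0],   # woman
    [1.0, 0.0, 1.0],   # king
    [0.0, 1.0, 1.0],   # queen
    [0.0, 0.0, 1.0],   # throne
    [0.0, 0.0, -1.0],  # sword
])

result = find_analogy(toy_embeddings, toy_vocab, "man", "woman", "king")
print(result)
```

With a real model, `toy_embeddings` would be replaced by the trained embedding matrix and `toy_vocab` by the vocabulary built during pre-processing; result quality depends heavily on corpus size and training.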

This is a Jupyter notebook, so explanations are included as markdown cells in the notebook. Feel free to play around with it and share comments if you have any.

The GOT corpus file is not included in this repository due to book copyright considerations, sorry about that. However, you can create your own data-set from whichever book (or text in general) you'd like. Just make sure it is a .zip file containing one or more .txt files.
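
As a rough sketch of the expected data-set layout, the snippet below builds a small .zip archive of .txt files in memory and reads it back into a flat list of words. The function name `read_corpus`, the file names, and the simple lowercase-and-regex tokenization are assumptions for illustration; the notebook's actual pre-processing may differ.

```python
import io
import re
import zipfile


def read_corpus(zip_source):
    """Read every .txt file in a zip archive into one list of words.

    zip_source: a path or file-like object accepted by zipfile.ZipFile.
    (Hypothetical helper, not part of the notebook.)
    """
    words = []
    with zipfile.ZipFile(zip_source) as archive:
        for name in sorted(archive.namelist()):
            if not name.lower().endswith(".txt"):
                continue  # skip anything that is not a text file
            text = archive.read(name).decode("utf-8", errors="ignore")
            # Assumed tokenization: lowercase, keep letters and apostrophes.
            words.extend(re.findall(r"[a-z']+", text.lower()))
    return words


# Build a tiny example archive in memory instead of on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("book1.txt", "Winter is coming.")
    archive.writestr("book2.txt", "A Lannister always pays his debts.")
buffer.seek(0)

demo_words = read_corpus(buffer)
print(demo_words)
```

For a real data-set, pass the path of your own .zip file to `read_corpus` instead of the in-memory buffer.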

Dependencies:
