PHP + pgSQL w/ pgvector webapp to take linear combinations of words using word2vec embeddings

jacksonwalters/wordchef

WORDCHEF

PHP + pgSQL + pgvector webapp to take linear combinations of words

  • Uses spaCy to generate the full vocabulary and the corresponding word vectors
  • Word vectors are stored in a PostgreSQL database with pgvector, which allows fast semantic search
  • Given two words, look up their word vectors and take the average
  • Find the nearest five words to the averaged vector
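
The steps above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the app's actual code: the toy 3-dimensional vectors, the `TOY_VECTORS` table, and the function names are all made up for the example, and cosine similarity is an assumed choice of metric. In the real app the lookup would presumably be a pgvector query that orders rows by a distance operator rather than a Python sort.

```python
import math

# Toy 3-dimensional embeddings, invented for illustration only.
# Real spaCy word vectors have hundreds of dimensions.
TOY_VECTORS = {
    "king":    [0.9, 0.8, 0.1],
    "queen":   [0.9, 0.2, 0.8],
    "monarch": [0.9, 0.5, 0.45],
    "man":     [0.1, 0.9, 0.1],
    "woman":   [0.1, 0.2, 0.9],
    "person":  [0.1, 0.55, 0.5],
    "apple":   [0.5, 0.0, 0.0],
}


def combine(terms):
    """Return the linear combination sum(weight * vector) over (word, weight) pairs."""
    dim = len(next(iter(TOY_VECTORS.values())))
    result = [0.0] * dim
    for word, weight in terms:
        for i, x in enumerate(TOY_VECTORS[word]):
            result[i] += weight * x
    return result


def cosine_similarity(u, v):
    """Cosine of the angle between u and v (0.0 if either vector is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_words(query, k=5):
    """Brute-force search: rank every vocabulary word by similarity to the query."""
    ranked = sorted(
        TOY_VECTORS,
        key=lambda w: cosine_similarity(query, TOY_VECTORS[w]),
        reverse=True,
    )
    return ranked[:k]


# The average of two words is the linear combination with weights 0.5 and 0.5.
average = combine([("king", 0.5), ("queen", 0.5)])
print(nearest_words(average, k=5))
```

Note that the input words themselves are not excluded from the results here, so they can appear among the five neighbours.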

TOPIC MODELING

Available at https://wordchef.app/topic_modeling

Built from https://github.com/jacksonwalters/nlp/topic_modeling

  • User uploads comments .csv and keywords .csv
  • Load pre-trained gensim doc2vec model
  • Embed comments as docs to get vector embeddings
  • Use k-means to cluster comments into groups
  • Compute TF-IDF scores across clusters
  • Embed keywords/bigrams and compute nearest bigram to cluster centroid
  • Find nearest keyword/bigram to each comment
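
The clustering and keyword-matching steps of this pipeline can be sketched as follows. This is a rough illustration under stated assumptions, not the repository's implementation: the 2-dimensional vectors stand in for doc2vec embeddings, the keyword names and all function names are hypothetical, Euclidean distance is an assumed metric, and the TF-IDF step is left out. The k-means here is plain Lloyd's algorithm seeded with the first `k` points so the run is deterministic.

```python
import math


def distance(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; the first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its closest centroid.
        labels = [
            min(range(k), key=lambda c: distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean(members)
    return labels, centroids


def nearest_keyword(vector, keyword_vectors):
    """Return the keyword whose embedding is closest to the given vector."""
    return min(keyword_vectors, key=lambda kw: distance(vector, keyword_vectors[kw]))


# Hypothetical embeddings standing in for doc2vec output.
COMMENT_VECTORS = [
    [0.9, 0.1], [0.1, 0.9], [0.8, 0.2],
    [0.2, 0.8], [1.0, 0.3], [0.3, 1.0],
]
KEYWORD_VECTORS = {"pricing": [1.0, 0.0], "support": [0.0, 1.0]}

labels, centroids = kmeans(COMMENT_VECTORS, k=2)
cluster_topics = [nearest_keyword(c, KEYWORD_VECTORS) for c in centroids]
comment_topics = [nearest_keyword(v, KEYWORD_VECTORS) for v in COMMENT_VECTORS]
print(labels, cluster_topics, comment_topics)
```

Labelling each cluster by the keyword nearest its centroid and labelling each comment individually are kept as separate calls, mirroring the last two bullets above.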
