Machine Learning Clustering and Retrieval: Text and Image Clustering Models

Description

  • Built Wikipedia article and image retrieval models using clustering and retrieval algorithms such as k-nearest neighbors, k-means, latent Dirichlet allocation, and hierarchical clustering.
  • Used expectation maximization, locality-sensitive hashing, and Gibbs sampling to build Gaussian mixture and mixed membership models that improve the assignment of data points to clusters.
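
To make the expectation maximization step concrete, here is a minimal NumPy sketch that fits a one-dimensional Gaussian mixture. It is an illustration only, not the code from the notebooks: the function name `em_gmm_1d`, the quantile-based initialization, the fixed iteration count, and the synthetic data are all assumptions chosen for brevity.

```python
import numpy as np


def em_gmm_1d(x, k, n_iter=100):
    """Fit a 1-D Gaussian mixture with k components by EM (illustrative sketch)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    # Assumed initialization: means at evenly spaced quantiles of the data,
    # a shared variance equal to the data variance, and uniform weights.
    means = np.quantile(x, (np.arange(k) + 0.5) / k)
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: resp[i, j] is the posterior probability that point i
        # was generated by component j (computed in log space for stability).
        diff = x[:, None] - means[None, :]
        log_pdf = -0.5 * (np.log(2.0 * np.pi * variances) + diff ** 2 / variances)
        log_resp = np.log(weights) + log_pdf
        log_resp -= log_resp.max(axis=1, keepdims=True)
        resp = np.exp(log_resp)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances from the soft counts.
        soft_counts = resp.sum(axis=0)
        weights = soft_counts / n
        means = (resp * x[:, None]).sum(axis=0) / soft_counts
        diff = x[:, None] - means[None, :]
        # A small floor keeps a variance from collapsing to zero.
        variances = (resp * diff ** 2).sum(axis=0) / soft_counts + 1e-6
    return weights, means, variances, resp


# Synthetic data (hypothetical): two well-separated clusters.
rng = np.random.default_rng(1)
sample = np.concatenate([rng.normal(-5.0, 1.0, 200), rng.normal(5.0, 1.0, 200)])
weights, means, variances, resp = em_gmm_1d(sample, k=2)
print(np.sort(means))
```

Each row of `resp` is a soft assignment: unlike k-means, a point is shared between clusters in proportion to how well each component explains it.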

Code

  1. Nearest Neighbors Search
  2. 1 Nearest Neighbor with Locality Sensitive Hashing
  3. K Means
  4. Expectation Maximization
  5. Expectation Maximization - Image Data (Gaussian Mixtures)
  6. Latent Dirichlet Allocation - Mixed Membership Model
  7. Hierarchical Clustering
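
As a rough illustration of the idea behind items 1 and 2, the sketch below hashes vectors into bins with random hyperplanes and then searches only the query's own bin by cosine similarity. It is a simplified, assumed version written with NumPy alone, not the notebook code: the names `build_lsh_index` and `query_lsh`, the number of bits, and the random data are hypothetical, and a fuller implementation would typically also search neighboring bins.

```python
from collections import defaultdict

import numpy as np


def build_lsh_index(data, n_bits=8, seed=0):
    """Hash each row of `data` to an integer bin using random hyperplanes."""
    rng = np.random.default_rng(seed)
    # One random hyperplane per bit; the sign of the projection gives the bit.
    planes = rng.standard_normal((data.shape[1], n_bits))
    bits = ((data @ planes) >= 0).astype(int)
    # Interpret each row of bits as a binary number to get the bin key.
    powers = 1 << np.arange(n_bits - 1, -1, -1)
    keys = bits @ powers
    table = defaultdict(list)
    for row_index, key in enumerate(keys):
        table[int(key)].append(row_index)
    return planes, powers, table


def query_lsh(query, data, planes, powers, table):
    """Return (row index, cosine distance) of the best match in the query's bin."""
    key = int(((query @ planes) >= 0).astype(int) @ powers)
    candidates = table.get(key, [])
    if not candidates:
        return None  # empty bin: this simple version does not widen the search
    cand = data[candidates]
    sims = cand @ query / (np.linalg.norm(cand, axis=1) * np.linalg.norm(query))
    best = int(np.argmax(sims))
    return candidates[best], 1.0 - float(sims[best])


# Hypothetical data: 500 random 16-dimensional vectors standing in for documents.
rng = np.random.default_rng(0)
docs = rng.standard_normal((500, 16))
planes, powers, table = build_lsh_index(docs, n_bits=6)
hit = query_lsh(docs[42], docs, planes, powers, table)
print(hit)
```

The trade-off is speed against recall: more bits mean smaller bins and faster queries, but a higher chance that the true nearest neighbor lands in a different bin and is missed.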

Programming Language

Python

Packages

Anaconda, GraphLab Create (Installation guide)

Tools/IDE

Jupyter notebook (IPython)

How to use it

  1. Fork this repository to have your own copy
  2. Clone your copy on your local system
  3. Install necessary packages

Note

This repository does not contain optimal machine learning models. It only assesses various models that can be built with different machine learning algorithms (either implemented in this repository or used directly from the GraphLab Create package) to perform different tasks.