We used a publicly available dataset of the top 1000 posts from the 50 largest subreddits, and trained a model to classify Reddit post titles and (string) bodies into these fifty subreddits. The dataset is included in this repository under archive/ for training purposes.
Additionally, this repository includes a frontend for interacting with the model. It displays probability estimates for the most likely classifications, along with a warning when the model is likely unsure of its prediction.
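The "unsure" warning works off the model's class probabilities. Below is a minimal sketch of that idea, not the notebook's exact code: the names `clf` and `vectorizer` and the 0.5 threshold are assumptions for illustration.

```python
# Sketch: derive top-k predictions and an "unsure" flag from predict_proba.
# `clf` (a fitted scikit-learn classifier) and `vectorizer` are assumed here.
import numpy as np

def top_predictions(title, body, clf, vectorizer, k=5, threshold=0.5):
    """Return the k most likely subreddits with probabilities, plus an
    'unsure' flag when no class is clearly favored."""
    features = vectorizer.transform([title + " " + body])
    probs = clf.predict_proba(features)[0]
    top = np.argsort(probs)[::-1][:k]
    results = [(clf.classes_[i], float(probs[i])) for i in top]
    unsure = probs[top[0]] < threshold  # no dominant class -> warn the user
    return results, unsure
```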
To run the project:
- Run each cell in the Subreddits.ipynb notebook in order, culminating in a trained model and a running Flask server.
- With the Flask server from the final notebook cell running, serve the frontend at frontend/index.html.
Note that our Flask backend is currently hardcoded to run on localhost:3000; you may need to change this if that port is already in use.
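For reference, a minimal Flask endpoint of the kind the final notebook cell runs might look like the sketch below. The `/predict` route and JSON shape are assumptions, and it reuses the hypothetical `top_predictions` helper from the sketch above.

```python
# A minimal sketch of a prediction API on localhost:3000 (not the exact
# server from the notebook). Assumes `clf`, `vectorizer`, and
# `top_predictions` (see earlier sketch) are defined.
from flask import Flask, request, jsonify
from flask_cors import CORS

app = Flask(__name__)
CORS(app)  # allow the static frontend page to call the API cross-origin

@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json()
    results, unsure = top_predictions(
        data.get("title", ""), data.get("body", ""), clf, vectorizer
    )
    return jsonify({"predictions": results, "unsure": unsure})

if __name__ == "__main__":
    app.run(port=3000)  # matches the hardcoded localhost:3000
```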
Our project is built on several open-source libraries and modules, namely:
- flask and flask_cors (Web app framework to serve a model prediction API)
- nltk (Natural language toolkit we used for extracting words from posts; see the sketch after this list)
- numpy
- pandas
- re (Python standard library regex operations)
- scikit-learn
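The following is a hypothetical illustration of the kind of word extraction nltk and re enable; the exact preprocessing in Subreddits.ipynb may differ.

```python
# Sketch: lowercase a post, strip non-letters with re, tokenize with nltk.
import re
import nltk

# Tokenizer models, needed once (newer NLTK versions use "punkt_tab").
nltk.download("punkt", quiet=True)

def extract_words(text):
    """Lowercase, strip non-letter characters, and tokenize into words."""
    cleaned = re.sub(r"[^a-z\s]", " ", text.lower())
    return nltk.word_tokenize(cleaned)

print(extract_words("Check out r/aww: 10/10 good doggo!"))
# -> ['check', 'out', 'r', 'aww', 'good', 'doggo']
```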
Additionally, our model is based on initial code from a scikit-learn tutorial on text classification. We used the tutorial as a starting point for structuring the model, but we preprocessed the data ourselves, ran our own search over the parameters we judged most useful, and built the Flask server and frontend ourselves.
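A sketch in the spirit of that tutorial's setup is shown below: a bag-of-words pipeline plus a small parameter search. The specific estimator, parameter grid, and data variable names are assumptions, not the notebook's actual choices.

```python
# Sketch: text-classification pipeline with a grid search over a few
# parameters that plausibly matter for post titles/bodies.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", SGDClassifier()),
])

param_grid = {
    "tfidf__ngram_range": [(1, 1), (1, 2)],  # unigrams vs. uni+bigrams
    "tfidf__min_df": [1, 3],                 # rare-word cutoff
    "clf__alpha": [1e-4, 1e-3],              # regularization strength
}

search = GridSearchCV(pipeline, param_grid, cv=5, n_jobs=-1)
# search.fit(post_texts, subreddit_labels)  # hypothetical training data
```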