Naive Bayes Classifier for article headlines which annotates a headline as either 'clickbait' or 'news' (non-clickbait)
annabfenske/clickbait_classifier
Anna Fenske (af2570)
NLP Final Project

CLICKBAIT CLASSIFIER

Files:

features.py: feature extractor

train.py: collect featuresets of training data, write them to training_data.json

test.py: train classifier and test on test_clickbait.json and test_news.json
    TO RUN: python test.py [news OR clickbait OR all]

testing_data\test_clickbait.json: corpus of headlines from Buzzfeed (not in training corpus) for testing
results\output_clickbait.txt: output from test.py on test_clickbait.json

testing_data\test_news.json: corpus of headlines from New York Times (not in training corpus) for testing
results\output_news.txt: output from test.py on test_news.json

testing_data\test_all.json: test data from both test_clickbait.json and test_news.json
results\output_all.txt: output from test.py on test_all.json

training_data.json: headlines from all sources and their feature sets and annotations for training classifier
    NOTE: training_data.json not included in this repository (file size too large).
    This should not be an issue though since test.py runs train.py.

training_data: directory holding annotated training data separated by source
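The workflow described above (extract a feature set from each headline, train a Naive Bayes classifier on the annotated feature sets, then label unseen headlines) can be sketched roughly as follows. This is a minimal, dependency-free illustration and not the code in features.py, train.py, or test.py: the four features, the NaiveBayes class, and the six training headlines are all hypothetical, chosen only to show the shape of the pipeline.

```python
import math
from collections import Counter, defaultdict


def extract_features(headline):
    """Hypothetical features; the repository's features.py may use different ones."""
    words = headline.split()
    lowered = [w.lower().strip(".,!?'\"") for w in words]
    return {
        "starts_with_number": bool(words) and words[0].isdigit(),
        "has_second_person": any(w in ("you", "your") for w in lowered),
        "has_demonstrative": any(w in ("this", "these") for w in lowered),
        "ends_with_punct": headline.rstrip().endswith(("?", "!")),
    }


class NaiveBayes:
    """Naive Bayes over categorical features with add-k (Laplace) smoothing."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing
        self.label_counts = Counter()
        # (label, feature name) -> how often each feature value was seen
        self.feature_counts = defaultdict(Counter)
        # feature name -> set of values seen in training
        self.feature_values = defaultdict(set)

    def train(self, labeled_featuresets):
        for features, label in labeled_featuresets:
            self.label_counts[label] += 1
            for name, value in features.items():
                self.feature_counts[(label, name)][value] += 1
                self.feature_values[name].add(value)

    def classify(self, features):
        total = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.label_counts.items():
            # log P(label) + sum of log P(feature value | label)
            score = math.log(count / total)
            for name, value in features.items():
                seen = self.feature_counts[(label, name)][value]
                n_values = len(self.feature_values[name]) or 1
                score += math.log(
                    (seen + self.smoothing) / (count + self.smoothing * n_values)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up headlines for illustration only; not taken from the project's corpora.
training = [
    ("17 Things You Will Only Understand If You Own A Cat", "clickbait"),
    ("This Is What Happens When You Stop Sleeping", "clickbait"),
    ("Can You Guess These Movies From One Frame?", "clickbait"),
    ("Senate Passes Budget Bill After Long Debate", "news"),
    ("Oil Prices Fall as Supply Concerns Ease", "news"),
    ("Court Upholds Ruling in Patent Dispute", "news"),
]

classifier = NaiveBayes()
classifier.train([(extract_features(h), label) for h, label in training])

print(classifier.classify(extract_features("21 Photos You Need To See Right Now")))
print(classifier.classify(extract_features("Governor Signs Transit Funding Measure")))
```

With this toy data the first headline is labeled 'clickbait' and the second 'news'. A real run would use far more headlines and richer features, which is presumably why training_data.json is too large to include.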