fake_news_topic_detection

To improve the classifier's performance on both tasks (fake news detection and topical domain classification), we apply the following steps:

Combine the "headline" and "content" columns of the dataset, excluding any rows that have no content.
Clean the data by removing email addresses, hyperlinks, numbers, special characters, and duplicates.
After testing Decision Tree, Random Forest, and Multinomial Naive Bayes classifiers (with parameters such as unigram and bigram features), we chose the Multinomial Naive Bayes algorithm, using the scikit-learn library to classify the cleaned text, based on its results.
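
The combining and cleaning steps above could look roughly like the sketch below. It uses only the Python standard library and is not taken from task3.py: the regular expressions, the helper names `clean_text` and `prepare_rows`, the sample rows, and the assumption that the columns are keyed `headline` and `content` are all illustrative.

```python
import re

# Illustrative patterns; the actual rules used in task3.py may differ.
EMAIL_RE = re.compile(r"\S+@\S+")
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")  # digits and special characters


def clean_text(text):
    """Lowercase the text and strip hyperlinks, emails, numbers and symbols."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    text = NON_LETTER_RE.sub(" ", text)
    return " ".join(text.split())  # collapse repeated whitespace


def prepare_rows(rows):
    """Combine headline and content, drop rows without content, deduplicate."""
    seen = set()
    cleaned = []
    for row in rows:
        content = (row.get("content") or "").strip()
        if not content:
            continue  # exclude rows that have no content
        text = clean_text(f"{row.get('headline') or ''} {content}")
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned


# Hypothetical sample rows, invented for this example.
rows = [
    {"headline": "Budget 2021 approved!",
     "content": "Details at https://example.com or mail info@example.com"},
    {"headline": "Empty story", "content": ""},
    {"headline": "Budget 2021 approved!",
     "content": "Details at https://example.com or mail info@example.com"},
]
print(prepare_rows(rows))  # ['budget approved details at or mail']
```

Hyperlinks are removed before the generic character filter so that fragments of a URL do not survive as stray words. Deduplication here happens after cleaning, which also merges rows that differ only in punctuation or digits; deduplicating on the raw text instead would be an equally reasonable choice.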
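
A comparison of the three classifiers with unigram and bigram features might be set up as in the following sketch. It assumes scikit-learn is installed and uses `CountVectorizer` for the n-gram features; the tiny corpus, the labels, and the hyperparameters are invented for illustration, and the README does not say which vectorizer or settings task3.py actually uses.

```python
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Tiny invented corpus of already-cleaned text, for illustration only.
texts = [
    "aliens built the pyramids secret report claims",
    "miracle pill cures every disease overnight",
    "celebrity secretly replaced by clone insiders say",
    "parliament approves annual budget after debate",
    "central bank keeps interest rate unchanged",
    "city council opens new public library",
]
labels = ["fake", "fake", "fake", "real", "real", "real"]

# Candidate classifiers named in the README; hyperparameters are assumptions.
candidates = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "multinomial_nb": MultinomialNB(),
}

# One fitted pipeline per (classifier, n-gram range) combination.
models = {}
for name, classifier in candidates.items():
    for ngram_range in [(1, 1), (1, 2)]:  # unigrams only, unigrams + bigrams
        model = make_pipeline(
            CountVectorizer(ngram_range=ngram_range),
            clone(classifier),  # fresh copy so pipelines do not share state
        )
        model.fit(texts, labels)
        models[(name, ngram_range)] = model

# Classify a new, hypothetical piece of cleaned text with every model.
new_text = ["secret report claims aliens opened public library"]
for (name, ngram_range), model in models.items():
    print(name, ngram_range, model.predict(new_text)[0])
```

On a real dataset the models would be compared on held-out data, for example with `sklearn.model_selection.cross_val_score`, rather than on a toy corpus like this one. The same pipeline structure applies to topical domain classification by swapping the labels for topic names.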

