This project analyzes tweets about autonomous cars, each labeled with a sentiment score from 1 to 5 (negative to positive).
The data was collected from http://www.crowdflower.com/data-for-everyone and Kaggle, and consists of tweets about self-driving cars. The dataset contains 7,156 observations and 9 variables.
- Sentiment analysis using supervised ML models. The models compared are logistic regression, KNN, SVM, Decision Tree, Naive Bayes, and Random Forest
- Semi-supervised modeling (Doc2vec, to capture context in sentiment analysis); the code is in this repository
- Topic modeling, a clustering technique used to identify the top topics in the tweets
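
To give a feel for the supervised step, here is a minimal sketch of one of the compared model families, Naive Bayes, written with only the Python standard library. It is not the code from this repository: the class name `NaiveBayesSentiment`, the `tokenize` helper, and the six toy tweets are all made up for illustration, and a real run would train on the 7,156 labeled tweets instead.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # tweets per sentiment label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label in sorted(self.class_counts):
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.class_counts[label] / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up toy tweets (NOT taken from the dataset), labeled on the 1-5 scale.
train_texts = [
    "self driving cars are terrifying and dangerous",
    "i hate the idea of driverless cars",
    "google tested a self driving car today",
    "driverless cars are on the road in california",
    "i love self driving cars they are amazing",
    "autonomous cars are a great and exciting idea",
]
train_labels = [1, 1, 3, 3, 5, 5]

model = NaiveBayesSentiment().fit(train_texts, train_labels)
print(model.predict("i love these amazing cars"))
```

The same `fit`/`predict` shape applies to the other models in the comparison, so swapping in a different classifier and scoring each one on a held-out split is one reasonable way to reproduce the comparison.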