Deep Pollster: Political Orientation Prediction

University of Cincinnati Senior Design 2020

Deep Pollster is a data analysis application that uses deep learning to predict the political leaning of Twitter users in a particular geographical region of the USA.

Project Description

Abstract

Social networking has risen to prominence as a medium for publishing information. The power to sway and portray political opinions is shifting from traditional media, such as newspapers and television networks, to social media platforms like Twitter. This has given rise to new directions of research in computational political science.

In this project we reexamine the problem of measuring and predicting the political orientation of Twitter users. We expect to contribute to the study of the political blogosphere by incorporating multiple hypotheses about the behavior of both the average Twitter user and registered politicians. Incorporating signals such as tweets, retweets, subtweets, follower and followee networks, and degrees of separation helps us better understand the political landscape on Twitter and how to leverage these sources of information. Many researchers have turned to Twitter to analyze its effect on major political events such as the 2016 and 2020 U.S. elections; our technical contribution is a reimagining of the traditional problem of predicting the political leaning of a given user.

By studying the political orientation of Twitter users, it is possible to target advertisements at individuals, shape digital profiles, and deliver personalized news, articles, views and products. The same approach could also be used to forecast the outcome of an election by predicting the leaning of users in a geographical location.

Index Terms - Twitter, Political Science, NLP, Deep Learning, Neural Networks

Challenges

Sentiment analysis on tweets alone has disadvantages as a way to assess political leaning:

It does not paint a complete, holistic picture of a user's ideological views
A digital profile of a user cannot be built from a single tweet, or even from a temporal series of tweets
Assessing the political leaning of a demographic does not tell us the orientation of an individual

Solution

We leverage more than just tweet-retweet maximization or a network matrix:

Binary classification of the latest tweets, retweets and liked tweets on the basis of political leaning
Identifying the degree of separation between the user and politicians on both sides of the political spectrum

Our proposed system is designed to provide a more rounded, holistic sense of the individual user by painting an overall picture of their digital profile, which has potential applications in marketing and business.

Team Members

Shivchander Sudalairaj

Sagar Panwar

Faculty Advisor

Anca Ralescu

Architecture

Modules

Data Extraction Module

Dataset

Link to download the dataset

Model

Classification Model

Ensemble Classifier
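
The README describes separate binary classifications for a user's tweets, retweets and liked tweets, and later mentions a hypothesis about variable weights. One way such an ensemble could be combined is a weighted average of the per-source scores. The sketch below is only an illustration under stated assumptions: the function names, the weight values, the 0.5 threshold, and the convention that 0.0 means Democratic-leaning and 1.0 means Republican-leaning are all hypothetical and not taken from the project code.

```python
from typing import Dict

# Hypothetical weights (assumption): the README only states that the three
# sources are weighted differently, not what the weights are.
DEFAULT_WEIGHTS: Dict[str, float] = {"tweets": 0.5, "retweets": 0.3, "likes": 0.2}


def ensemble_score(scores: Dict[str, float],
                   weights: Dict[str, float] = DEFAULT_WEIGHTS) -> float:
    """Weighted average of per-source scores, each assumed to lie in [0, 1].

    Sources missing from `scores` (for example, a user with no liked tweets)
    are skipped and the remaining weights are renormalised.
    """
    used = {name: w for name, w in weights.items() if name in scores}
    total = sum(used.values())
    if total == 0:
        raise ValueError("no scored source has a positive weight")
    return sum(scores[name] * w for name, w in used.items()) / total


def ensemble_label(scores: Dict[str, float],
                   weights: Dict[str, float] = DEFAULT_WEIGHTS,
                   threshold: float = 0.5) -> str:
    """Map the combined score to a binary label (assumed convention)."""
    return "republican" if ensemble_score(scores, weights) >= threshold else "democrat"
```

For example, `ensemble_label({"tweets": 0.2, "retweets": 0.4, "likes": 0.6})` combines the three scores with the tweets model counting the most, which matches the observation that tweet predictions were the most stable.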

Networking Module

Degree of Separation (Erdős Number)
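
The degree of separation between a user and a set of politicians can be computed as a shortest-path length in the follow graph, in the same spirit as an Erdős number. A minimal sketch using breadth-first search follows. It assumes the follow graph has already been fetched into an in-memory adjacency mapping; the `Graph` type, the function name and the example handles are hypothetical and do not come from the project code.

```python
from collections import deque
from typing import Dict, Iterable, Optional, Set

# Assumed representation: handle -> set of handles that account follows.
Graph = Dict[str, Set[str]]


def degree_of_separation(graph: Graph, source: str,
                         targets: Iterable[str]) -> Optional[int]:
    """Return the fewest follow-hops from `source` to any handle in `targets`.

    Returns None when no target is reachable. Breadth-first search visits
    accounts in order of distance, so the first target found is the closest.
    """
    target_set = set(targets)
    if source in target_set:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour in seen:
                continue
            if neighbour in target_set:
                return dist + 1
            seen.add(neighbour)
            queue.append((neighbour, dist + 1))
    return None


# Toy follow graph with made-up handles, for illustration only.
example_graph: Graph = {
    "user": {"friend_a", "friend_b"},
    "friend_a": {"dem_politician"},
    "friend_b": {"friend_c"},
    "friend_c": {"rep_politician"},
}
```

Running the function once per party (for instance with `{"dem_politician"}` and then `{"rep_politician"}` as targets) gives two distances that can be compared to see which side of the political spectrum the user sits closer to in the network.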

Technologies

Testing & Results

Test Plan

Test Case

The model was tested with multiple sets of test cases to reduce the effect of any innate bias
Each set of test cases consists of 50 previously untested politicians' Twitter handles

Hypothesis: Politicians are relatively consistent in the language patterns they follow when publishing tweets on Twitter

Results

Result 1

After running our tests, we observed that our model predicted Democratic politicians with 100% accuracy, while misclassifications arose with Republican politicians, resulting in an accuracy of 80%.
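
Per-party figures like these are per-class accuracies: for each true label, the fraction of test handles the model classified correctly. The sketch below shows one way to compute them. The function name and the label strings are hypothetical, and the toy labels in the usage note are made up for illustration; they are not the project's test data.

```python
from collections import Counter
from typing import Dict, Sequence


def per_class_accuracy(y_true: Sequence[str],
                       y_pred: Sequence[str]) -> Dict[str, float]:
    """For each true label, return the fraction of its samples predicted correctly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / totals[label] for label in totals}
```

With toy labels `y_true = ["D", "D", "R", "R", "R", "R", "R"]` and `y_pred = ["D", "D", "R", "R", "R", "R", "D"]`, the function returns 1.0 for `"D"` and 0.8 for `"R"`, since one of the five `"R"` handles is misclassified.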

Result 2

After running our tests, we observed that our model identified Democrats with significantly higher confidence than Republicans.

We also observed that predictions from the tweets model were more stable than those from the retweets and liked-tweets models. This supports our initial hypothesis of using variable weights.

Inference

Democratic tweets consistently fall on the political left
Republican tweets fall more on the political centre and centre-right
Democrats are more consistent in their language, and in turn in the political ideologies they express on Twitter
Republicans use language that is more ambiguous and tends to waver between the left and right of the political spectrum
Naively extrapolating this model to general public users will introduce inherent bias

Future Work

Tweet content analysis and classification can be further improved to handle spam accounts. Sarcasm is also not yet handled by our model, and additions could be made to account for it
To develop a model that generalizes to general public users, we would need to survey users about their political orientation, build a more general dataset, and retrain the model on it

User Manual

Video

Presentation

Poster

Assessments

Initial Self-Assessment

Final Self-Assessment

Summary of Hours

	Date	Category	Shiv	Sagar
1	1/20 - 1/25	Twitter API	4	10
2	1/26 - 2/1	Data Exploration	5	10
3	2/2 - 2/15	Research Exploration	5	10
4	2/16 - 2/22	Experiments with LSTM	5	3
5	2/23 - 2/29	Data Preprocessing	5	2
6	3/1 - 3/7	Model Architecture Experiments	10	5
7	3/8 - 3/14	Model Training	6	2
8	3/15 - 3/21	Model Testing	4	2
9	3/22 - 4/4	Integration	7	2
10	4/5 - 4/11	Documentation and Github	4	4
		Total Hours	55	50

Assignments

License

MIT license
Copyright 2020

shivchander/political-alignment-prediction
