Twitter-Sentiment-Analysis-using-Spark-Streaming-and-Kafka

Description:

Developed a Spark Streaming application that continuously reads tweets about a given topic from Twitter.

The tweets obtained are analyzed for sentiment using NLP and sent to a Kafka topic.

Sentiments from the Kafka topic are forwarded to Elasticsearch, using Logstash as the pipeline.

Visualized the sentiment of tweets from the Elasticsearch data using Kibana.

Instructions to run:

The application needs 7 arguments. The first 4 are the Twitter OAuth credentials. Arguments to be passed: consumer key, consumer secret key, access token, secret access token, Twitter topic, Kafka topic, checkpoint directory path.
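The argument list above can be validated before any streaming starts. The following is a minimal sketch, not the code in sparkStreaming.scala: the `AppArgs` case class and its field names are hypothetical, the arguments are assumed to arrive in the order listed above, and the use of twitter4j system properties for OAuth is an assumption about how the credentials are consumed.

```scala
// Hypothetical holder for the seven arguments; the names are illustrative
// and are not taken from sparkStreaming.scala.
final case class AppArgs(
    consumerKey: String,
    consumerSecret: String,
    accessToken: String,
    accessTokenSecret: String,
    twitterTopic: String,
    kafkaTopic: String,
    checkpointDir: String
)

object AppArgs {
  // Assumption: arguments are passed in the order listed in the README.
  def parse(args: Array[String]): Either[String, AppArgs] = args match {
    case Array(ck, cs, at, ats, tt, kt, cp) =>
      Right(AppArgs(ck, cs, at, ats, tt, kt, cp))
    case _ =>
      Left(
        s"Expected 7 arguments but got ${args.length}: " +
          "<consumer key> <consumer secret key> <access token> " +
          "<secret access token> <twitter topic> <kafka topic> " +
          "<checkpoint directory path>"
      )
  }

  // Assumption: the Twitter client is twitter4j, which reads OAuth
  // credentials from these JVM system properties.
  def oauthProperties(a: AppArgs): Map[String, String] = Map(
    "twitter4j.oauth.consumerKey" -> a.consumerKey,
    "twitter4j.oauth.consumerSecret" -> a.consumerSecret,
    "twitter4j.oauth.accessToken" -> a.accessToken,
    "twitter4j.oauth.accessTokenSecret" -> a.accessTokenSecret
  )
}
```

In a `main` method, `AppArgs.parse(args)` could be matched on: a `Left` prints the usage message and exits, and a `Right` supplies the values, for example by calling `System.setProperty` for each entry of `oauthProperties`.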

Run ZooKeeper and the Kafka server, then create a topic. Start a consumer that will dump messages to standard output.

Run Elasticsearch, Kibana and Logstash to visualize data in real time.

To run the project, run the sparkStreaming class (sparkStreaming.scala) in IntelliJ, or create an assembly (fat) JAR file and run it with the command below:

spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.0 --class sparkStreaming PathToJarFile <the 7 arguments listed above>

asifsmohammed/Twitter-Sentiment-Analysis-using-Spark-Streaming-and-Kafka

Folders and files

Latest commit

History

Repository files navigation

Twitter-Sentiment-Analysis-using-Spark-Streaming-and-Kafka

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages