This framework allows users who tweet in Turkish to get an estimation of their Big Five OCEAN Personality scores via their tweets.
As of October 2022, this project is inactive and no longer maintained.
For academic background of this framework, please refer to Clustering based Personality Prediction on Turkish Tweets by Tutaysalgir, E., Karagoz, P. and Toroslu, I.H., 2019, August.
Please refer to INSTALL.md
- Python 3.8
- Java 8
- MySQL or MariaDB
For a list of required Python libraries, refer to requirements.txt.
This framework uses 4 services to run properly. Namely,
- Zemberek service
- Word2Vec service
- Backend REST service via Flask
- Frontend service via React
Zemberek provides the NLP functionalities for the framework. The communication between the framework and Zemberek is achieved through gRPC.
To run Zemberek service, you can use run_zemberek.sh
.
Word2Vec service provides an API to get Word2Vec representations of words. You can use run_word2vec.sh
to run the
service.
The backend of this project relies on a RESTful API to allow any frontend or application to make requests and get responses from it. To achieve this, we use Flask, which has a very simple but powerful interface.
This service is by nature multithreaded, and it spawns a new thread for every new vector calculation request to handle multiple users at a time. If you expect a huge number of users at the same time, it might be useful to use autoscaling solutions.
The frontend is purely for aesthetics, and it is extremely under-developed. As long as making the proper REST requests, any frontend should work just fine.
Even though this framework is designed as a web tool, it is possible to use this through a terminal window as well. In order to do so, follow these steps after installation:
- Run Zemberek and Word2Vec services.
- Fill in
auth_pair
inmain.py
with your credentials. - Run
python main.py username
if you want to give username as an argument, orpython main.py
to enter username when asked. - You will receive the predicted OCEAN score in the terminal.
It is possible to download tweets of a user and use this csv file for the vector construction. In order to do so, do the following steps:
- Fill in
auth_pair
indownload-tweets.py
with your credentials. - Download tweets of a user via
python download-tweets.py
. This will save the tweets in thedata/tweets
folder. - Run the vector constructor via
python main.py username --file
arguments. This will read the tweets from the csv file instead of downloading them from scratch.
This method was implemented to avoid burning out the Twitter API for recurrent downloads.
Made in Ankara with 💙