Toxic comment detection system

About

A detection system that predicts different types of toxicity in comments. The project uses NLP (TF-IDF) and XGBoost. The back-end and front-end servers are built with Flask and wrapped in Docker. It covers both binary classification and multi-target classification.

All steps are shown in the main notebook, train.ipynb.

Stack:

ML: NLP, XGBoost, TF-IDF, sklearn, pandas, numpy, matplotlib
API: Flask, Flask-WTF
Containerization: Docker
Data: Kaggle, Jigsaw Toxic Comment Classification Challenge

Only one feature: comment_text (text)

Feature transform:

regex cleaning
TF-IDF
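
The two transform steps above can be sketched in plain Python. This is only an illustration of the idea, not the code from train.ipynb: the cleaning rules (lowercasing, dropping URLs and non-letter characters) are assumptions, and the hand-written `tfidf` helper is a hypothetical stand-in for the sklearn vectorizer listed in the stack. It uses a smoothed IDF and skips normalization to stay short.

```python
import math
import re
from collections import Counter


def clean_text(text: str) -> str:
    """Lowercase, drop URLs and non-letter characters, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # strip URLs (assumed rule)
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters only (assumed rule)
    return re.sub(r"\s+", " ", text).strip()   # collapse runs of whitespace


def tfidf(docs: list[str]) -> list[dict[str, float]]:
    """Return one {term: weight} dict per document (smoothed IDF, no normalization)."""
    tokenized = [clean_text(doc).split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return weights


comments = ["You are great!!! http://example.com", "you are awful, awful person"]
vectors = tfidf(comments)
```

Terms that occur in every comment ("you", "are") get the lowest weight, while a term repeated inside one comment ("awful") is weighted up.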

User guide

1. Prepare

1.1 Clone the repository:

$ git clone https://github.com/hildar/python-flask-docker.git

1.2 If you have problems downloading the ML model, you can download it from logreg_pipeline.dill or create it with train.ipynb, then put it in the folder app/models/.

1.3 Build the Docker image:

$ cd python-flask-docker/docker
$ docker build -t python-flask-docker app/

1.4 Run the Docker container:

$ docker run -d -p 8180:8180 -p 8181:8181 python-flask-docker
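
Once the container is started, it can help to confirm that both published ports accept connections before moving on. The snippet below is a minimal sketch using only the Python standard library; `port_is_open` is a hypothetical helper, not part of the repository.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8180 and 8181 are the two ports published by the docker run command.
    for port in (8180, 8181):
        state = "open" if port_is_open("localhost", port) else "closed"
        print(f"localhost:{port} is {state}")
```

An open port only shows that something is listening; it does not prove the Flask app finished loading the model.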

2. Usage

There are two ways to use the system:

First way: front server

Open http://localhost:8180 to use the front server. Type a comment into the web form to see the result.

Second way: Jupyter Notebook

Use the Jupyter notebook "Client.ipynb" to check, step by step, the server at http://localhost:8181.
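
Outside a notebook, the same server could be queried from a plain Python script. The sketch below is an assumption-heavy illustration: the "/predict" path, the "comment_text" JSON field and the JSON response format are hypothetical and are not confirmed by this README, so check Client.ipynb for the request the server really expects.

```python
import json
import urllib.request

# Assumptions: the "/predict" path and the "comment_text" field are hypothetical.
API_URL = "http://localhost:8181/predict"


def build_request(comment: str, url: str = API_URL) -> urllib.request.Request:
    """Build a JSON POST request carrying one comment."""
    body = json.dumps({"comment_text": comment}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and decode the response, assuming it is a JSON object."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_request("You are a wonderful person")
# With the container running: print(send(request))
```

Building the request separately from sending it makes it easy to inspect the payload before the server is involved.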
