Trains a bag-of-words classifier to detect adult content. It needs two data sources:
- A JSONL file with data that was filtered out because it was adult content
- A JSONL file with normal data
Both files should be language specific, that is, contain text in the same single language.
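
To illustrate the idea, here is a minimal sketch of a bag-of-words classifier trained on two JSONL sources. It is not this project's implementation: it swaps in a small multinomial Naive Bayes model written with the standard library only, and the `text` field name, the `read_jsonl` helper, and the `NaiveBayesBow` class are all hypothetical names chosen for the example.

```python
import json
import math
import re
from collections import Counter

TOKEN_RE = re.compile(r"\w+")


def tokenize(text):
    """Lowercase the text and split it into word tokens (the 'bag of words')."""
    return TOKEN_RE.findall(text.lower())


def read_jsonl(lines, field="text"):
    """Yield one field from each non-empty JSONL line.

    The field name "text" is an assumption; adjust it to the real schema.
    """
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)[field]


class NaiveBayesBow:
    """Multinomial Naive Bayes over word counts with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha            # smoothing constant
        self.word_counts = {}         # label -> Counter of token counts
        self.doc_counts = Counter()   # label -> number of training documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokens:
                score += math.log((counts[token] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Tiny in-memory stand-ins for the two JSONL files (made-up example data).
adult_lines = ['{"text": "explicit xxx adult video"}',
               '{"text": "hot adult xxx content"}']
normal_lines = ['{"text": "weather forecast for tomorrow"}',
                '{"text": "recipe for tomato soup"}']

adult = list(read_jsonl(adult_lines))
normal = list(read_jsonl(normal_lines))
model = NaiveBayesBow().fit(
    adult + normal,
    ["adult"] * len(adult) + ["normal"] * len(normal),
)
```

With real data, the two lists of lines would come from opening the two JSONL files; `model.predict(some_text)` then returns either `"adult"` or `"normal"`.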
Install the dependencies with Poetry:
poetry install
Then run the training command (a click command-line interface):
poetry run train --help