Refactor blacklist/whitelist code #5

Open
jlnr opened this issue Jul 18, 2016 · 4 comments

Comments

@jlnr
Member

jlnr commented Jul 18, 2016

There are lots of regular expressions in the Database scripts that decide which Wikipedia pages are about movies, and what's crap along the lines of "List of Italian splatter porn actors of the 90s".

These blacklists and whitelists should probably be moved to configuration files or constants, for easier editing.
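A minimal sketch of what the "constants" variant could look like. The constant names, the two patterns, and `movie_title?` are all made up for illustration; the real patterns would come from the existing scripts.

```ruby
# Hypothetical constants; the actual patterns live in the database scripts.
TITLE_BLACKLIST = [
  /\AList of /i # list pages, e.g. "List of ... actors of the 90s"
].freeze

TITLE_WHITELIST = [
  /\(film\)\z/i # titles explicitly disambiguated as films
].freeze

# Assumed precedence: the whitelist wins over the blacklist,
# and a title matched by neither list is kept.
def movie_title?(title)
  return true if TITLE_WHITELIST.any? { |re| re.match?(title) }

  TITLE_BLACKLIST.none? { |re| re.match?(title) }
end
```

With the patterns gathered in one place, editing a list no longer means hunting through the scripts for inline regular expressions.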

@jlnr
Member Author

jlnr commented Jul 18, 2016

The blacklist also wrongly filters movies such as "My Sex Life... or How I Got into an Argument" (not porn, apparently).
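One possible way to handle false positives like this is a list of exact-title exceptions that is checked before any pattern. This is only a sketch: `KEYWORD_BLACKLIST` is a stand-in for whatever pattern currently catches the title (which I haven't looked up), and `TITLE_EXCEPTIONS` and `censored?` are hypothetical names.

```ruby
require "set"

# Hypothetical pattern; stands in for whatever currently flags this title.
KEYWORD_BLACKLIST = [/\bsex\b/i].freeze

# Exact titles known to be regular movies despite matching a blacklist pattern.
TITLE_EXCEPTIONS = Set[
  "My Sex Life... or How I Got into an Argument"
].freeze

# Exceptions are checked first, so a listed title is never censored.
def censored?(title)
  return false if TITLE_EXCEPTIONS.include?(title)

  KEYWORD_BLACKLIST.any? { |re| re.match?(title) }
end
```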

@jlnr
Member Author

jlnr commented Jul 18, 2016

And on the other hand, there's still at least one porn actor in the list: "Barrett Long (Pornodarsteller)", when actors shouldn't be included at all.
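Entries like this could probably be caught by looking at the parenthesized disambiguation suffix at the end of the title. A rough sketch, where the suffix list is an assumption (only "Pornodarsteller" comes from the entry above) and `person_page?` is a hypothetical helper:

```ruby
# Assumed German disambiguation suffixes that mark a page about a person.
PERSON_SUFFIXES = %w[
  Schauspieler Schauspielerin Pornodarsteller Pornodarstellerin
].freeze

# Matches titles ending in "(<suffix>)", e.g. "Some Name (Schauspieler)".
PERSON_SUFFIX_PATTERN =
  /\((?:#{Regexp.union(PERSON_SUFFIXES).source})\)\z/.freeze

def person_page?(title)
  PERSON_SUFFIX_PATTERN.match?(title)
end
```

Anchoring the pattern at the end of the title keeps it from firing on movies that merely contain one of these words.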

@jlnr
Member Author

jlnr commented Jul 18, 2016

And there are still entries that end in 小說 (novel) in Japanese (except my Kanji is slightly wrong, so I can't be bothered to find and delete these now).

@jlnr
Member Author

jlnr commented Sep 24, 2016

Idea: Instead of just printing "Censoring…" to stdout in my rake tasks, also collect these pages in CSV files. That'd make it much easier to see the effects of blacklist/whitelist changes.
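Roughly what that could look like with Ruby's standard `csv` library. The helper names, the `title`/`rule` columns, and the exact wording of the stdout message are assumptions, not what the rake tasks do today.

```ruby
require "csv"

# Hypothetical helper: keeps a stdout message and also records the page.
def log_censored(csv, title, rule)
  puts "Censoring #{title}"
  csv << [title, rule]
end

# Writes one row per censored page, with a header row first.
# `censored` is an array of [title, rule] pairs collected by the task.
def write_censored_report(path, censored)
  CSV.open(path, "w") do |csv|
    csv << %w[title rule]
    censored.each { |title, rule| log_censored(csv, title, rule) }
  end
end
```

Running the task before and after a blacklist change and diffing the two CSV files would then show exactly which pages were added or dropped.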
