Large number of probably not-genuine users #2064

Open
Jongmassey opened this issue Sep 23, 2024 · 3 comments

@Jongmassey
Contributor

2661 Users in total

Looking at the Django admin page for opencodelists, many of these appear not to be genuine users (Slack thread).

I'm not sure there's a robust method to identify these accounts that fail the "looks a bit dodgy" sniff test, but here are some possible heuristics:

For example, 89 are from .ru domains, and 1132 have email addresses containing more than four . characters, typically spammy phrases embedded in the username part and split by dots:

>>> len([u for u in User.objects.all() if u.email.count('.')>4])
1132
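
The two heuristics above could be combined into one check on the email string. This is only a rough sketch: `looks_suspicious` and its parameters are hypothetical names, and unlike the query above it counts dots in the username part only, so the numbers it gives may differ from 1132. Both rules will also flag some genuine users.

```python
def looks_suspicious(
    email: str,
    max_dots: int = 4,
    flagged_tlds: tuple[str, ...] = (".ru",),
) -> bool:
    """Return True if the address trips either heuristic (hypothetical helper)."""
    address = email.strip().lower()
    # Split on the last "@"; if there is none, local_part is empty.
    local_part, _, domain = address.rpartition("@")
    too_many_dots = local_part.count(".") > max_dots
    # str.endswith accepts a tuple of suffixes.
    flagged_domain = domain.endswith(flagged_tlds)
    return too_many_dots or flagged_domain
```

It could then be applied to each `u.email` in the shell to produce a candidate list for manual review rather than for direct deletion.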

2081 have no codelists associated with them (2661 in total, minus the 580 found here):

>>> handleusers = set([h.user for h in Handle.objects.all()])
>>> versionauthors = set([v.author for v in CodelistVersion.objects.all()])
>>> anyauthor = handleusers.union(versionauthors)
>>> len(anyauthor)
580
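
The same set logic can be written as a small function over plain collections, which makes the subtraction explicit. This is a sketch only: `users_without_codelists` is a hypothetical name, and dropping `None` assumes that a version's author might be unset, which I haven't checked against the model.

```python
def users_without_codelists(all_users, handle_users, version_authors):
    """Return the users that own no handle and authored no version."""
    # Assumption: an author may be missing (None), so exclude it from the set.
    contributors = (set(handle_users) | set(version_authors)) - {None}
    return set(all_users) - contributors
```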
@Jongmassey
Contributor Author

If these users haven't created any codelists, then whichever spam bot (presumably) signed up for these accounts hasn't found a way to use the site for spamming.

Therefore, are the only downsides of these accounts' presence a) filling our database with rubbish and b) overestimating our user count?

If so, do we need to do anything about this?

If we do, what do we do?

  • choose a/some heuristic(s) and
    • delete based on that
    • send an email along the lines of "reply within 30 days if you're human" and delete non-responders
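
The "reply within 30 days" option could be sketched roughly as below. The tuple layout, `GRACE_PERIOD` and `accounts_to_delete` are all hypothetical; how the email is sent and how replies get recorded is left open.

```python
from datetime import datetime, timedelta, timezone

GRACE_PERIOD = timedelta(days=30)  # from the "reply within 30 days" wording


def accounts_to_delete(candidates, now=None):
    """Pick accounts that were emailed, never replied, and are past the deadline.

    candidates: iterable of (user_id, emailed_at, replied) tuples, where
    emailed_at is a timezone-aware datetime (hypothetical layout).
    """
    now = now or datetime.now(timezone.utc)
    return [
        user_id
        for user_id, emailed_at, replied in candidates
        if not replied and now - emailed_at >= GRACE_PERIOD
    ]
```

Keeping the selection separate from the deletion means the resulting list can be checked by a human before anything is removed.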

Do we need to prevent future creation of these accounts by adding email verification, a CAPTCHA, or something else to the user account creation page?

@StevenMaude
Contributor

StevenMaude commented Oct 1, 2024

> Do we need to prevent future creation of these accounts by adding email verification, a CAPTCHA, or something else to the user account creation page?

Could something like django-honeypot also be worth trying, to see if it reduces the problem?

Or doing something similar manually on the signup page (adding a non-visible form field), without the extra dependency?

Many automated bots are fairly greedy and fill in any form field, even if not visible to a real user, which can be used to filter out those bots from real users.
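
A minimal sketch of the manual version, assuming the signup form gains a hidden field that real users never see or fill in. The field name `website` and the helper `is_probable_bot` are made up for illustration; this is not django-honeypot's API.

```python
HONEYPOT_FIELD = "website"  # hypothetical hidden field, hidden from users via CSS


def is_probable_bot(form_data: dict[str, str]) -> bool:
    """Return True if the hidden honeypot field was filled in."""
    # A real user leaves the field empty (or it is absent entirely);
    # a greedy bot that fills every input gives itself away.
    return bool(form_data.get(HONEYPOT_FIELD, "").strip())
```

The signup view could call this on the submitted data and reject (or silently drop) the registration when it returns True.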

@lucyb
Contributor

lucyb commented Oct 3, 2024

I'm going to take this off the board for now and suggest we do it as part of our Onboarding OpenCodelists initiative.
