Add extremely common word sequences? #63
Comments
Hey, thanks for the suggestion.
This would mean you need to use the dictionary matcher to identify all those different words, and then use some kind of sequence matcher to go through all those matches and check whether they are in a row. I like the general idea of this but I don't see the solution right now. If you have an idea, feel free to open a PR or create your own package; since `1.0.0-beta-1` custom matchers are possible, but I think it would be easier to add it to the repo to reuse the dictionary and add a custom DictionarySequence matcher.
You're welcome. This is what immediately comes to mind. I haven't given it any deep thought, so there may be dragons.
One idea: if these words already exist in an orderless dictionary, they can be moved out into an array of sequence arrays (as in your example).
(Then we aren't loading duplicate words into the browser, which keeps the bundle small.)
Then the regular word matcher can be tweaked to look for words in the sequence arrays, just like it currently looks for words in the orderless array.
Once that's done, it's a simple matter of writing an algorithm to look for words from the sequence arrays within the password. If a match is found, check the following or previous word; if that matches, check the next word (while loop). Then remove that string from the password and mark it as score 1 or zero or whatever. Then repeat the process with whatever's left of the password.
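The steps above could be sketched roughly as follows. This is only an illustration under assumptions, not code from zxcvbn-ts: `SEQUENCES`, `readRun`, and `extractSequences` are hypothetical names, the lists are shortened, and a real version would reuse the existing dictionaries and feed the result into scoring instead of returning it.

```typescript
// Hypothetical sequence arrays; shortened for illustration.
const SEQUENCES: string[][] = [
  ["one", "two", "three", "four", "five", "six", "seven", "eight", "nine"],
  ["spring", "summer", "autumn", "winter"],
];

interface SequenceHit {
  start: number;   // index in the remaining password where the run begins
  text: string;    // the matched run, e.g. "onetwothree"
  words: string[]; // the individual words in the run
}

// Read a run of consecutive list entries starting at `pos`.
// `step` is +1 for "following word" order and -1 for "previous word" order.
function readRun(password: string, pos: number, list: string[], idx: number, step: number): string[] {
  const words: string[] = [];
  let cursor = pos;
  let k = idx;
  while (k >= 0 && k < list.length && password.slice(cursor, cursor + list[k].length) === list[k]) {
    words.push(list[k]);
    cursor += list[k].length;
    k += step;
  }
  return words;
}

// Find the longest run of at least two words, cut it out of the password,
// then repeat on whatever is left until no run remains.
function extractSequences(password: string): { hits: SequenceHit[]; rest: string } {
  const hits: SequenceHit[] = [];
  let rest = password.toLowerCase();
  for (;;) {
    let best: SequenceHit | null = null;
    for (const list of SEQUENCES) {
      for (let idx = 0; idx < list.length; idx++) {
        let pos = rest.indexOf(list[idx]);
        while (pos !== -1) {
          for (const step of [1, -1]) {
            const words = readRun(rest, pos, list, idx, step);
            const text = words.join("");
            if (words.length >= 2 && (best === null || text.length > best.text.length)) {
              best = { start: pos, text, words };
            }
          }
          pos = rest.indexOf(list[idx], pos + 1);
        }
      }
    }
    if (best === null) {
      return { hits, rest };
    }
    hits.push(best);
    rest = rest.slice(0, best.start) + rest.slice(best.start + best.text.length);
  }
}
```

Calling `extractSequences("onetwothreefour!")` would report one run covering `onetwothreefour` and leave `!` as the remainder, which the usual matchers could then score on their own.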
It seems relatively trivial to add this functionality, but of course time is precious and it would take a little while. Currently I have no free time. Just contributing the idea at this point.
No pressure.
Thanks for the library :)
…On 22 July 2021 3:19:11 PM SAST, MrWook ***@***.***> wrote:
Hey, thanks for the suggestion.
I like the idea, the problem is that this would be a combination from
the dictionary matcher and the sequence matcher.
Basically you need a dictionary for every language that has those
sequences. For example like this:
```
{
"numbers": [
"one",
"two",
"three"
...
],
"seasons": [
"spring",
"summer",
"autumn",
"winter"
]
...
}
```
This would mean you need to use the dictionary matcher to identify all
those different words, and then use some kind of sequence matcher to go
through all those matches and check whether they are in a row.
I like the general idea of this but I don't see the solution right now.
If you have an idea, feel free to open a PR or create your own package;
since `1.0.0-beta-1` custom matchers are possible, but I think it would
be easier to add it to the repo to reuse the dictionary and add a
custom DictionarySequence matcher.
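As a rough sketch of the "dictionary matches first, sequence check second" idea: the code below takes word matches that are assumed to have been found already and groups those that sit directly next to each other in the password and are consecutive in the same category list. The `WordMatch` shape and the function name are assumptions for illustration, not the library's actual types or API, and only ascending order is handled.

```typescript
// Hypothetical shape of a dictionary match; not the library's real type.
interface WordMatch {
  i: number;     // start index in the password (inclusive)
  j: number;     // end index in the password (inclusive)
  token: string; // the matched word
}

type SequenceDictionary = Record<string, string[]>;

interface SequenceMatch {
  category: string;
  i: number;
  j: number;
  tokens: string[];
}

const sequences: SequenceDictionary = {
  numbers: ["one", "two", "three"],
  seasons: ["spring", "summer", "autumn", "winter"],
};

function findDictionarySequences(matches: WordMatch[], dict: SequenceDictionary): SequenceMatch[] {
  const found: SequenceMatch[] = [];
  for (const category of Object.keys(dict)) {
    const list = dict[category];
    // Keep only matches whose token belongs to this category, ordered by position.
    const inCategory = matches
      .filter((m) => list.indexOf(m.token) !== -1)
      .sort((a, b) => a.i - b.i);

    let chain: WordMatch[] = [];
    // Emit the current chain if it holds at least two words, then reset it.
    const flush = (): void => {
      if (chain.length >= 2) {
        found.push({
          category,
          i: chain[0].i,
          j: chain[chain.length - 1].j,
          tokens: chain.map((m) => m.token),
        });
      }
      chain = [];
    };

    for (const m of inCategory) {
      const prev: WordMatch | undefined = chain[chain.length - 1];
      // "In a row" means adjacent in the password and next in the list.
      const continues =
        prev !== undefined &&
        m.i === prev.j + 1 &&
        list.indexOf(m.token) === list.indexOf(prev.token) + 1;
      if (!continues) {
        flush();
      }
      chain.push(m);
    }
    flush();
  }
  return found;
}
```

For the matches `one` (0 to 2), `two` (3 to 5), and `three` (6 to 10), this would return a single `numbers` sequence spanning indices 0 to 10. Overlapping matches within one category simply break the chain here; a fuller version would need to handle them.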
I had a similar, broader idea: since Dropbox kicked off this project, there have been some public leaks of unhashed password lists that should be game-changing data sources for a project like this. Instead of assuming that passwords use common words with the same frequency as written text ("you, to, it, that, ..."), we can rank them based on their actual usage in passwords. Based on actual leaked password lists, we can improve entropy scoring using (1) the popularity of the password structure (the set of patterns; e.g. (word)(number)(symbol) > (symbol)(word)(symbol)) and (2) the rank/weight of each particular pattern within those sets (e.g. onetwothreefour > correcthorsebatterystaple). That first exercise, determining the entropy of the password structure itself, was waived by the original project due to lack of data.
Of course, this exercise is the same as improving the efficiency of a password cracker. But that was essentially the point of zxcvbn to begin with: to help password strength meters "catch up" to password cracking libraries.
(I understand that this fork is focused on cleanup, tech debt, and other higher priority things :) Hopefully it is flattering and not annoying that the suggestions are coming here now.)
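To make point (1) concrete, here is a minimal sketch of ranking password structures by how often they occur in a corpus. Everything in it is an assumption for illustration: the function names are hypothetical, "word" is approximated crudely as any run of letters instead of a real dictionary match, and the corpus would have to come from an actual password list.

```typescript
type TokenClass = "word" | "number" | "symbol";

// Crude character classes; a real version would use matcher output instead.
function classify(ch: string): TokenClass {
  if (/[a-z]/i.test(ch)) return "word";
  if (/[0-9]/.test(ch)) return "number";
  return "symbol";
}

// Collapse a password into its structure,
// e.g. letters, then digits, then punctuation gives "(word)(number)(symbol)".
function structureOf(password: string): string {
  let structure = "";
  let previous: TokenClass | null = null;
  for (let k = 0; k < password.length; k++) {
    const current = classify(password.charAt(k));
    if (current !== previous) {
      structure += "(" + current + ")";
      previous = current;
    }
  }
  return structure;
}

// Count each structure in the corpus and sort from most to least common.
function rankStructures(corpus: string[]): Array<[string, number]> {
  const counts: Record<string, number> = {};
  for (const pw of corpus) {
    const s = structureOf(pw);
    counts[s] = (counts[s] || 0) + 1;
  }
  return Object.keys(counts)
    .map((s): [string, number] => [s, counts[s]])
    .sort((a, b) => b[1] - a[1]);
}
```

The resulting ranks could then act as a weight on the guess estimate, so that a password following a very common structure is treated as easier to guess than one following a rare structure.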
@modest this fork isn't just a cleanup. I wanted to revive the project and the idea behind it because I think those password policies are just plain stupid.
@modest and @MrWook, I did a little bit of thinking on this today, and I agree that keeping those as separate matchers (or at least different match passes) seems like the right way to go. I'd be interested in implementing this in Nbvcxz as well if there seems to be a consensus on how the algorithm should work and on appropriate scoring values.
TLDR: `123456` is pretty much the most common password in the world and also has no entropy due to being an obvious sequence. zxcvbn-ts falls on its face with `onetwothreefourfivesix`, rating it as maximum strength. Let's fix that?
Just an idea, not sure if this is commonly done with passwords.
But just as 123456789, 987654321, abcdefg, etc. are seen as completely lacking entropy, what about:
- Months: `januaryfebruarymarch`, `julyjunemay`
- Written numbers: `onetwothree`, `nineeightseven`
- Seasons: `springsummerautumn`, `winterspringsummer`
- Bible books: `genesisexoduswhatever` etc.
- Sizes: `smallmediumlarge`, `largemediumsmall`
- Greek letters: `alphabeta` etc.
- Phonetic alphabet: `alphabravocharliedelta`, `tangosierraromeo`
zxcvbn-ts currently thinks all this sort of junk is a strong password (might need to add an extra word in some cases, but normally 3-4 words, and it thinks you're golden), when you've basically got no entropy if you're using any of the above.
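A back-of-the-envelope estimate shows how small this search space is. The sketch below assumes an attacker who knows the lists and tries every run of two or more consecutive items, from any starting point, forwards or backwards, with wrap-around allowed (as in `winterspringsummer`). The list sizes and function names are assumptions for illustration, and the count is an upper bound for lists that do not naturally wrap.

```typescript
// Hypothetical list sizes; only the counts matter for this estimate.
const listSizes: Record<string, number> = {
  months: 12,
  seasons: 4,
  sizes: 3,           // small, medium, large
  writtenNumbers: 10, // assumption: "one" through "ten"
  greekLetters: 24,
  phoneticAlphabet: 26,
};

// n starting points, run lengths 2..n (n - 1 choices), and 2 directions.
function candidateRuns(n: number): number {
  return 2 * n * (n - 1);
}

function totalCandidates(sizes: Record<string, number>): number {
  let total = 0;
  for (const name of Object.keys(sizes)) {
    total += candidateRuns(sizes[name]);
  }
  return total;
}

const guesses = totalCandidates(listSizes);
const bits = Math.log(guesses) / Math.LN2;
console.log(guesses, bits.toFixed(1)); // 2884 "11.5"
```

Under these assumptions every such password falls within a few thousand guesses, roughly 11 to 12 bits, no matter how many characters it contains.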
Obviously there's an endless number of common sequences people could put into a password, like listing the characters of a popular TV series.
But I figured the categories I wrote above should be standard because, regardless of a person's preferences or personality, they'll deal with (or be familiar with) most, if not all, of the above, with the possible exception of the names of the Bible's books.