Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Really the Cuid2 algorithm? #7

Open
sdavids opened this issue May 18, 2024 · 4 comments
Open

Really the Cuid2 algorithm? #7

sdavids opened this issue May 18, 2024 · 4 comments
Labels
help wanted Extra attention is needed question Further information is requested

Comments

@sdavids
Copy link

sdavids commented May 18, 2024

Describe the bug

I am unsure if this implementation is the same algorithm as the original.

To Reproduce

The entropy is computed differently:

https://github.com/paralleldrive/cuid2/blob/53e246b0919c8123e492e6b6bbab41fe66f4b462/src/index.js#L7-L14

private static String createEntropy(final int length) {
int primeNumber;
final StringBuilder stringBuilder = new StringBuilder(length);
while (stringBuilder.length() < length) {
primeNumber = PRIME_NUMBER_ARRAY[Math.abs(Common.nextIntValue()) % PRIME_NUMBER_ARRAY.length];
stringBuilder.append(Integer.toString(primeNumber * Common.nextIntValue(), NUMBER_BASE));
}
return stringBuilder.toString();


The hash is computed differently:

https://github.com/paralleldrive/cuid2/blob/53e246b0919c8123e492e6b6bbab41fe66f4b462/src/index.js#L31-L35

private static String computeHash(final String content, final int saltLength) {
final String salt = createEntropy(saltLength);
try {
return new BigInteger(MessageDigest.getInstance("SHA3-256").digest((content + salt).getBytes(StandardCharsets.UTF_8)))
.toString(NUMBER_BASE);
} catch (final NoSuchAlgorithmException exception) {
throw new CUIDGenerationException(exception);
}
}


The alphabet is accessed differently:

https://github.com/paralleldrive/cuid2/blob/53e246b0919c8123e492e6b6bbab41fe66f4b462/src/index.js#L37-L39

final char firstLetter = CUIDv2.ALPHABET_ARRAY[Math.abs(Common.nextIntValue()) % CUIDv2.ALPHABET_ARRAY.length];

https://github.com/ai/nanoid?tab=readme-ov-file

random % alphabet is a popular mistake [...] [t]he distribution will not be even; there will be a lower chance for some symbols to appear compared to others.

@sdavids
Copy link
Author

sdavids commented May 18, 2024

@ericelliott Maybe you could provide an authoritative assessment?

@sdavids
Copy link
Author

sdavids commented May 18, 2024

I opened this issue because I considered doing a 1:1 algorithm translation in a new repo but I think it would be better to update this project instead of creating a new one.

@thibaultmeyer thibaultmeyer added help wanted Extra attention is needed question Further information is requested labels Jun 19, 2024
@ericelliott
Copy link

ericelliott commented Jun 21, 2024

Correct that % is not a safe method to generate uniform random values. Are the histogram and randogram tests run in this repo?

@ArnieGA
Copy link

ArnieGA commented Feb 5, 2025

Any updates on the issue?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed question Further information is requested
Projects
None yet
Development

No branches or pull requests

4 participants