Baillie-PSW vs. Deterministic Miller-Rabin #2
The idea behind hashing is to take the relatively few composites that pass a strong Fermat test to some base, partition them into buckets, and then search for another base that eliminates all of the composites in each bucket. The hash function is selected to distribute them roughly evenly with minimal computation (in my software I use a single multiplication and shift). The best Fermat hashtables run in about 2x the time of a single Fermat test, or even slightly less, since a Montgomery transform of 2 can be performed faster than that of any other base.

From a cursory reading of this library, it doesn't seem to be using an optimal algorithm (many intervals use 3 bases, even though a Lucas-V test is already faster), although the way the arithmetic is performed may be able to compensate for that.

If you are familiar with Rust, you can read the source code of f-analysis's to_hashtable function for a "straightforward" implementation. See also the more complex corrector_set function, which can "correct" a hashtable to work with only a single base over a small interval. It does this by running over the residue classes of the multiplicative hash instead of just the known pseudoprimes to a selected base. (This was used to enable machine-prime to use only 1 Fermat base for n < 2^47.)

Note: I have no affiliation with this project and have just discovered it, so I can't give any information about it.
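As a rough sketch of the pattern just described (not this library's code; the multiplier, table size, and BASES contents are placeholders, since a real table only comes out of the search over pseudoprimes), the whole test looks roughly like this:

```rust
const TABLE_BITS: u32 = 14; // assumed 2^14-entry table
static BASES: [u64; 1 << TABLE_BITS] = [3; 1 << TABLE_BITS]; // placeholder bases

// Multiply-shift hash: a single multiplication and a shift.
fn bucket(n: u64) -> usize {
    const M: u64 = 0x9E37_79B9_7F4A_7C15; // arbitrary odd multiplier
    (n.wrapping_mul(M) >> (64 - TABLE_BITS)) as usize
}

fn mul_mod(a: u64, b: u64, m: u64) -> u64 {
    ((a as u128 * b as u128) % m as u128) as u64
}

fn pow_mod(mut b: u64, mut e: u64, m: u64) -> u64 {
    let mut r = 1u64;
    b %= m;
    while e > 0 {
        if e & 1 == 1 {
            r = mul_mod(r, b, m);
        }
        b = mul_mod(b, b, m);
        e >>= 1;
    }
    r
}

// One strong-probable-prime (Miller-Rabin) round to base a, for odd n > 3.
fn sprp(n: u64, a: u64) -> bool {
    let (mut d, mut s) = (n - 1, 0u32);
    while d % 2 == 0 {
        d /= 2;
        s += 1;
    }
    let mut x = pow_mod(a % n, d, n);
    if x == 1 || x == n - 1 {
        return true;
    }
    for _ in 1..s {
        x = mul_mod(x, x, n);
        if x == n - 1 {
            return true;
        }
    }
    false
}

// The whole test: base 2 first (cheapest, as noted above), then the hashed base.
fn is_prime_hashed(n: u64) -> bool {
    match n {
        0 | 1 => false,
        2 | 3 => true,
        _ if n % 2 == 0 => false,
        _ => sprp(n, 2) && sprp(n, BASES[bucket(n)]),
    }
}
```

A real BASES table is what a to_hashtable-style search produces: for each bucket, a base that eliminates every base-2 pseudoprime hashing into it.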
Thanks for chiming in! I think the thing I had a hard time understanding was what exactly was being hashed. I found this website, which linked to this paper, and now I finally understand: the number being tested is itself hashed, and the hash value picks which precomputed base to use for the second strong test.
There's clearly a lot more to it than that (doing a little trial division, designing the hash function, finding all the bases, etc.), but at least I finally understand the basic idea.
Trial division produces very little benefit, since the majority of pseudoprimes are semiprimes with relatively large factors, and the ones with small factors will be eliminated by the majority of bases anyway. The primary issue to look out for is ensuring that primes aren't mapped to a bucket containing a base that has the prime as a factor, because the Fermat test requires that gcd(a, p) = 1, and clearly gcd(b*p, p) != 1. The FJ64_16K table that the authors provide contains such an error (691, if I remember correctly).

Another approach is to use the hashed value directly as the base rather than as an index, but of course this is only practical in a triple-base test, and it doesn't seem to be any better than 2,15 or 2,9375 followed by a hashtable (both of which can be implemented with half the memory of the triple-base test that F&J provide). And triple-base tests are worse than a base-2 Fermat test followed by a Lucas-V test in both memory and time complexity, so that avenue is basically a dead end anyway.
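To make that gcd pitfall concrete, here is a rough validation sketch, reusing bucket and BASES from the earlier sketch; the trial-division reference test keeps it illustrative only, since checking a real 64-bit table would use a sieve:

```rust
// Sanity check: no prime p below `bound` may land in a bucket whose base is a
// multiple of p, or the strong test would wrongly reject the prime p itself.
fn table_is_safe(bound: u64) -> bool {
    (3..bound)
        .step_by(2)
        .filter(|&p| is_prime_slow(p))
        .all(|p| BASES[bucket(p)] % p != 0)
}

// Trial-division reference test, used only to validate the table.
fn is_prime_slow(n: u64) -> bool {
    if n < 2 {
        return false;
    }
    let mut d = 2;
    while d * d <= n {
        if n % d == 0 {
            return false;
        }
        d += 1;
    }
    true
}
```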
Hello,
I'm a neophyte to fast integer factoring and primality testing, but I've recently been reading up on the various techniques.
I can see that this library uses deterministic Miller-Rabin. The README links to this Wikipedia page section on deterministic variants of the test.
I've recently implemented Baillie-PSW myself, including the strong version of the Lucas test. Nicely's Baillie-PSW page states that its cost is roughly equivalent to 3 to 7 iterations of the Miller-Rabin test.
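For concreteness, the strong Lucas half starts by choosing parameters with Selfridge's "Method A"; a minimal Rust sketch of just that parameter-selection step (the Lucas sequence evaluation itself, which is most of the work, is omitted):

```rust
/// Jacobi symbol (a/n) for odd n > 0, via the standard binary algorithm.
fn jacobi(mut a: i64, mut n: i64) -> i32 {
    debug_assert!(n > 0 && n % 2 == 1);
    a %= n;
    if a < 0 {
        a += n;
    }
    let mut t = 1;
    while a != 0 {
        while a % 2 == 0 {
            a /= 2;
            if n % 8 == 3 || n % 8 == 5 {
                t = -t;
            }
        }
        std::mem::swap(&mut a, &mut n);
        if a % 4 == 3 && n % 4 == 3 {
            t = -t;
        }
        a %= n;
    }
    if n == 1 { t } else { 0 }
}

/// Selfridge "Method A": try D = 5, -7, 9, -11, ... until Jacobi(D/n) = -1,
/// then use P = 1 and Q = (1 - D) / 4. Perfect squares must be screened out
/// beforehand, since no such D exists for them and the loop would never end.
fn selfridge_params(n: i64) -> (i64, i64, i64) {
    let mut d = 5i64;
    while jacobi(d, n) != -1 {
        d = if d > 0 { -(d + 2) } else { -(d - 2) };
    }
    (d, 1, (1 - d) / 4) // (D, P, Q)
}
```

With D in hand, the strong test writes n + 1 = d·2^s and checks the usual U_d ≡ 0 or V_{d·2^r} ≡ 0 (mod n) conditions.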
However, from cursory browsing of your project's website, it's clear that you've put great effort into optimizing your particular deterministic Miller-Rabin implementation. For example, you're doing something with hashing to choose the bases, but I don't really understand the idea behind that step. Nevertheless, I'd put more stock in your implementation than in that paragraph from Wikipedia, even though you link directly to it in your README.
Could you please comment on how you see the performance of the best deterministic Miller-Rabin implementations relative to Baillie-PSW?
And could you give a brief summary of (or a link to a tutorial on) the idea behind the base hashing?