-
-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-99108: Import SHA2-224 and SHA2-256 from HACL* #99109
Conversation
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
Most changes to Python require a NEWS entry. Please add it using the blurb_it web app or the blurb command-line tool. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be useful for this kind of thing to include a script (somewhere under Tools?.. where to put it can be figured out later) that can generate and update the hacl-star vendored code implementation. As I assume we'll not want this to be a one time code drop but periodically update it, at least for every major Python release, as hacl-star evolves.
Please include a benchmark comparing to the OpenSSL based sha2 as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add all new headers to MODULE__SHA256_DEPS
in Makefile.pre.in
.
I would prefer if you keep the sha256module.c
file in its current place and only replace libtom code with HACL calls. It's easier to review and keeps the history for untouched lines.
Adding to the numbers in the PR description #99109 (comment) we get the following. Note that OpenSSL is compiled without assembly support (
|
LOL, thanks, but fair enough. 😄 This does at least answer my broader question about the feasibility of also getting rid of the OpenSSL backed |
Which seems to just be a copyright message update.
We do this with other third party code as well, it avoids linking and dynamic linking ODR violations when extension module or embedding applications also use their own version of the library code in question. + Minor annotation on the test, use `python -m test -u cpu` + Add a link to the readme.
Okay, I think this is ready. I pushed a few changes to the PR branch:
The list confined to its own .h file was manually generated and thus needs manual updating, but I doubt the list will change often. I'd like a test to confirm that no |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small style nit. Also, there's a cast warning on line 127 in Modules/sha256module.c.
Autoconf changes looks good to me.
Co-authored-by: Erlend E. Aasland <[email protected]>
Thanks both @gpshead and @erlend-aasland! @erlend-aasland what warning are you seeing? Is it a cast from python's size_t to uint32_t? I wasn't able to reproduce (neither on gcc-11/x86_64/Linux, nor {clang,gcc-12}/x86_64/OSX), but will happily add a cast if that's what you're seeing. |
Yeah, it's a cast warning. You'll see it in the diff tab on the PR; it's one of the GitHub windows runners that's complaining, IIRC (I'm on mobile now). |
I fixed the final review comments... @erlend-aasland do I need to re-request a review from you, or is there anything I can do? thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is in good shape! But on its own - lets call this step "2a" - it'll be a little odd to ship in 3.12 if we don't also have some natural related followups to include as well:
- 2b) SHA384 & SHA512 support (sha512module.c and related hacl code).
- 3) SHA3 support (more generated hacl code).
I'll go ahead and merge this, and be responsible for reverting it if we decide we don't wind up just the shorter half of SHA2 using it if the follow-ons aren't ready yet before 3.12-beta1 in May.
Could you go ahead and prepare follow-on PRs?
Yup! I'll respond on the issue so that we don't scatter the overall discussion across individual PRs. |
Issue #99108 is about replacing hashlib primitives (for the non-OpenSSL case) with verified implementations from HACL*. This is the first PR in the series, and focuses specifically on SHA2-256 and SHA2-224.
This PR imports Hacl_Streaming_SHA2 into the Python tree. This is the HACL* implementation of SHA2, which combines a core implementation of SHA2 along with a layer of buffer management that allows updating the digest with any number of bytes. This supersedes the previous implementation in the tree.
@franziskuskiefer was kind enough to benchmark the changes: in addition to being verified (thus providing significant safety and security improvements), this implementation also provides a sizeable performance boost!
The changes in this PR are as follows:
Modules/_hacl
To review the changes, one should focus on
sha256module.c
, and possiblyHacl_Streaming_SHA2.h
which is the API entry point. The rest of the files are part of HACL* and are cherry-picked straight from the HACL* repository, so as to minimize the amount of hand-edits and facilitate future refreshes of the code, should HACL* land some performance improvements.The original code comes from https://github.com/project-everest/hacl-star/tree/master/dist/gcc-compatible. The various .h files in
include/
,krmllib/
, etc. will be shared across (hopefully) future primitives imported from HACL*. I can trim those files to minimize the amount of includes, but then this will make refreshing the code in the future harder (since the trimming will have to be redone for each refresh).The code was originally authored by @karthikbhargavan and I; @polubelova and @R1kM provided a considerable amount of help over the past week, as we were overhauling our implementation to offer a significant performance boost (we want this first PR to be a good one!). Thanks to everyone involved in this team effort.
There are a couple remarks left in the source code, but for now, I think it's better to open up a PR and start discussing. As promised, tagging @alex for this PR. Thanks so much!