Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Emscripten branch to arc4random #1061

Closed
wants to merge 32 commits into from

Conversation

MoustaphaSaad
Copy link
Contributor

@MoustaphaSaad MoustaphaSaad commented May 28, 2024

The Emscripten platform already supports the necessary Linux API. The only missing component was adding a specific branch for Emscripten and including the existing Linux file.

This update enables all LibreSSL libraries to be built using the Emscripten toolchain.


Emscripten Build Guide

The only difference in the build process is that you need to provide the Emscripten toolchain file in the CMake configuration step. This can be done as follows:

mkdir build
cd build
cmake .. -DCMAKE_TOOLCHAIN_FILE=$EMSDK/upstream/emscripten/cmake/Modules/Platform/Emscripten.cmake -DENABLE_ASM=OFF -DLIBRESSL_TESTS=OFF
make

Please note that not all test cases build successfully, which is why the LIBRESSL_TESTS=OFF flag is used. I have a preliminary draft for building and running the tests, achieving an initial 110/129 passing rate. Let me know if you would like another PR for that.


Update

The tests are now fixed, so you can remove the -DLIBRESSL_TESTS=OFF flag from the guide above. You can run the tests using the usual make test command.


Update regarding ASAN build

Note: When enabling ASAN with Emscripten, the ssl_get_shared_ciphers test fails with the following error:

READ of size 4 at 0x124b53b4 thread T0
    #0 0x1a95f5 in strlcpy ../../../../../../../emsdk/emscripten/system/lib/libc/musl/src/string/strlcpy.c:24:33
    #1 0x1a93e7 in strlcat ../../../../../../../emsdk/emscripten/system/lib/libc/musl/src/string/strlcat.c:8:13
    #2 0x4f83 in main ../../ssl/ssl_lib.c:1735:7
    #3 0x800017da in Module._main /home/runner/work/portable/portable/build/tests/ssl_get_shared_ciphers.js:6106:102

This issue does not occur with regular Linux ASAN builds. It seems to be related to Emscripten's implementation of strlcpy. I found a similar issue reported here, which was supposedly fixed 4 years ago. I'm not sure why this problem persists.

The Emscripten platform already supports the
necessary Linux API. The only missing piece
was adding a branch specifically for Emscripten
and including the existing Linux file.

This update enables all LibreSSL libraries to
be built using the Emscripten toolchain.
@botovq
Copy link
Contributor

botovq commented May 29, 2024 via email

@MoustaphaSaad
Copy link
Contributor Author

The first issue I encountered was related to test execution. In WebAssembly, you need to pass the generated a.out.js to node to run it, as it's not an executable by itself. To address this, I introduced a function like the following:

function(add_platform_test TEST_NAME)
    if (EMSCRIPTEN)
        add_test(NAME ${TEST_NAME} COMMAND node ${ARGN})
    else()
        add_test(NAME ${TEST_NAME} COMMAND ${ARGN})
    endif()
endfunction()

This adjustment resolves most of the issues, and as mentioned, the majority of the test cases pass out of the box.

The remaining failures are mainly due to the lack of direct file system access in WebAssembly. These can be mitigated by using --preload-file link flags, as shown below:

set_target_properties(aeadtest PROPERTIES LINK_FLAGS
    "--preload-file ${TEST_SOURCE_DIR}/aeadtests.txt@${TEST_SOURCE_DIR}/aeadtests.txt"
)

The third category of failures stems from missing feature support in Emscripten, such as the lack of sigsuspend support (used in the explicit_bzero test) and fork support (used in the arc4randomforktest). I generally disable these tests for Emscripten.

I have not yet investigated or fixed all the failures, with approximately 10 tests still failing. If these changes and fixes are acceptable, I can continue working on the remaining test cases and submit them either in a separate PR or within this same PR.

I think the effort to fix the remaining issues isn't big; it can be investigated and fixed in a day or two. @botovq Please let me know your preference.

@botovq
Copy link
Contributor

botovq commented May 30, 2024

function(add_platform_test TEST_NAME)
    if (EMSCRIPTEN)
        add_test(NAME ${TEST_NAME} COMMAND node ${ARGN})
    else()
        add_test(NAME ${TEST_NAME} COMMAND ${ARGN})
    endif()
endfunction()

Makes sense, I'd be happy to use that abstraction.

set_target_properties(aeadtest PROPERTIES LINK_FLAGS
    "--preload-file ${TEST_SOURCE_DIR}/aeadtests.txt@${TEST_SOURCE_DIR}/aeadtests.txt"
)

This seems sensible as well.

I think disabling the explicit_bzero() and arc4random() tests in a first pass makes sense as well.

I am not familiar with emscripten but if it is as little effort as you make it out to be, I'd be happy to add a few patches and abstractions for preliminary support of it. Surely there are some possibilities to test this in CI?

Instead of landing it piecemeal, I think it would be preferable to add CI for emscripten to this PR and add the patches you have on top. Then we can see how far away we are and can worry about how to merge it once we get close.

In WebAssembly, you need to pass the generated a.out.js to node to run it, as it's not an executable by itself.
changed the aeadtest.sh to search to .js file as well.
added preload-file link flag to enable access to TEST_SOURCE_DIR
added preload-file link flag to enable access to TEST_SOURCE_DIR
bn_unit uses more memory than Emscripten default initial heap size.
ALLOW_MEMORY_GROWTH linker option is passed to fix the OOM error
@MoustaphaSaad MoustaphaSaad marked this pull request as draft May 31, 2024 15:48
added preload-file link flag to enable access to TEST_SOURCE_DIR
added preload-file link flag to enable access to TEST_SOURCE_DIR
added preload-file link flag to enable access to TEST_SOURCE_DIR
added preload-file link flag to enable access to TEST_SOURCE_DIR
changed the pq_test.sh to search to .js file as well
changed the quictest.sh to search to .js file as well.
added preload-file link flag to enable access to TEST_SOURCE_DIR
changed the servertest.sh to search to .js file as well.
added preload-file link flag to enable access to TEST_SOURCE_DIR
changed the shutdowntest.sh to search to .js file as well.
added preload-file link flag to enable access to TEST_SOURCE_DIR
added preload-file link flag to enable access to TEST_SOURCE_DIR
changed the ssltest.sh to search to .js file as well.
added preload-file link flag to enable access to TEST_SOURCE_DIR
testdsa doesn't have an executable instead it uses openssl executable which access various files for IO. adding such files to --preload-file is infeasible.
testenc doesn't have an executable instead it uses openssl executable which access various files for IO. adding such files to --preload-file is infeasible.
testrsa doesn't have an executable instead it uses openssl executable which access various files for IO. adding such files to --preload-file is infeasible.
tlstest uses socketpair syscall which is not supported by Emscripten
@MoustaphaSaad MoustaphaSaad marked this pull request as ready for review May 31, 2024 18:30
@MoustaphaSaad
Copy link
Contributor Author

Hello @botovq,

As agreed, I've updated the test cases and fixed nearly all of them except for the following:

  • explicit_bzero: This relies on sigsuspend, which is not supported in Emscripten.
  • arc4randomforktest: This relies on fork, which is also not supported in Emscripten.
  • testdsa, testenc, and testrsa: These don't have CMake targets (executables) and instead use OpenSSL, which accesses many input and output files. This makes adding a --preload-file flag infeasible, so I decided to disable them.

Other than these five, all remaining test cases (125 in total) pass.

I've also written an initial CI script to test Emscripten builds, but I haven't tested it because I can't run or trigger GitHub Actions CI.

I've structured my edits into atomic commits, with each commit fixing a single test case. This should make reviewing the entire PR easier by going through the series of commits.

Looking forward to your feedback.

.github/workflows/emscripten.yml Outdated Show resolved Hide resolved
.github/workflows/emscripten.yml Outdated Show resolved Hide resolved
.github/workflows/emscripten.yml Show resolved Hide resolved
tests/ssltest.sh Show resolved Hide resolved
Copy link
Contributor

@botovq botovq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering if the setting of the link flags warrants factoring into a helper, maybe add_platform_test_with_fs_access() or similar..

Generally I would be happy about as few if(EMSCRIPTEN) stanzas as possible.

tests/CMakeLists.txt Show resolved Hide resolved
tests/CMakeLists.txt Outdated Show resolved Hide resolved
@MoustaphaSaad
Copy link
Contributor Author

I was wondering if the setting of the link flags warrants factoring into a helper, maybe add_platform_test_with_fs_access() or similar..

Generally I would be happy about as few if(EMSCRIPTEN) stanzas as possible.

I was considering that as well. I also agree that it would be simpler with fewer if(EMSCRIPTEN) statements.

@MoustaphaSaad
Copy link
Contributor Author

@botovq
The issue with moving the --preload-file linking flag to the add_platform_test function is that some tests are invoked through a .sh script. In these cases, we don't call add_platform_test; instead, we use add_test for the reasons explained in this comment. Therefore, for these tests, we'll need to keep the --preload-file linker flag. Given that, I suggest we keep things as they are now, with tests that require file system access having the --preload-file flags passed individually.

@botovq
Copy link
Contributor

botovq commented Jun 7, 2024

I see. Thanks for the additional explanations. If it's not easily feasible, that's fine with me. I don't think we'll add many more types of tests anytime soon, so I should be able to navigate these cases when adding new ones.

As far as I'm concerned, I'm fine with merging this with whatever tweaks @busterb and @joshuasing would still want to see on top. I think it's best to deal with it in tree and see how it goes.

If it doesn't get in the way too badly (I don't currently see how it would) until the release (october), I think we should ship it as experimental for a cycle or two.

fix formatting in some commands. Add missing -fsanitize=address to the ASAN test
@MoustaphaSaad
Copy link
Contributor Author

@botovq
As I was adding the missing -fsanitize=address flags as requested in this comment, some of the tests started failing due to an Out Of Memory (OOM) error. According to the Emscripten website, it's recommended to use -sALLOW_MEMORY_GROWTH when address sanitizer is used, as it requires more memory than the default 16 MB.

To simplify the handling of Emscripten linking flags, I propose adding a prepare_emscripten_test_target function to include the --preload-file and -sALLOW_MEMORY_GROWTH flags. We can invoke this function for tests that need it.

The function would look like this:

function(prepare_emscripten_test_target TARGET_NAME)
	if(EMSCRIPTEN)
		set_target_properties(${TARGET_NAME} PROPERTIES LINK_FLAGS
			"--preload-file ${TEST_SOURCE_DIR} -sALLOW_MEMORY_GROWTH")
	endif()
endfunction()

What do you think? Should I go with this simplification instead of adding more if (EMSCRIPTEN) branches?

@botovq
Copy link
Contributor

botovq commented Jun 7, 2024

I like it.

introduce prepare_emscripten_test_target function to setup `--preload-file` and `-sALLOW_MEMORY_GROWTH` linker flags for test targets
Copy link
Contributor

@botovq botovq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still fine with this modulo one small request.

tests/CMakeLists.txt Outdated Show resolved Hide resolved
when built in asan mode, x509_asn1 fails with an Out Of Memory error because of higher memory usage of address sanitizer. This commit sets `-sALLOW_MEMORY_GROWTH` for that target by calling prepare_emscripten_test_target
@MoustaphaSaad
Copy link
Contributor Author

Note: When enabling ASAN with Emscripten, the ssl_get_shared_ciphers test fails with the following error:

READ of size 4 at 0x124b53b4 thread T0
    #0 0x1a95f5 in strlcpy ../../../../../../../emsdk/emscripten/system/lib/libc/musl/src/string/strlcpy.c:24:33
    #1 0x1a93e7 in strlcat ../../../../../../../emsdk/emscripten/system/lib/libc/musl/src/string/strlcat.c:8:13
    #2 0x4f83 in main ../../ssl/ssl_lib.c:1735:7
    #3 0x800017da in Module._main /home/runner/work/portable/portable/build/tests/ssl_get_shared_ciphers.js:6106:102

This issue does not occur with regular Linux ASAN builds. It seems to be related to Emscripten's implementation of strlcpy. I found a similar issue reported here, which was supposedly fixed 4 years ago. I'm not sure why this problem persists.

@MoustaphaSaad MoustaphaSaad requested a review from botovq June 7, 2024 17:56
@botovq
Copy link
Contributor

botovq commented Jun 7, 2024

I don't know how to fix that. That string is very unlikely to overflow the buffer as it is much larger than the concatenation of all ciphers we have.

Maybe it's worth a try to disable the strlcpy/strlcat detection and force use of our compat implementations for Emscripten?

portable/CMakeLists.txt

Lines 223 to 231 in 46ddd6b

check_function_exists(strlcat HAVE_STRLCAT)
if(HAVE_STRLCAT)
add_definitions(-DHAVE_STRLCAT)
endif()
check_function_exists(strlcpy HAVE_STRLCPY)
if(HAVE_STRLCPY)
add_definitions(-DHAVE_STRLCPY)
endif()

strlcpy and strlcat Emscripten implementations cause ASAN errors. This commit disables strlcpy and strlcat detection and uses the compat implementations instead.
@MoustaphaSaad
Copy link
Contributor Author

@botovq Awesome, I tested that on my machine and that indeed fixed the ASAN errors, Thanks for the nice idea

@botovq
Copy link
Contributor

botovq commented Jun 10, 2024

Nice, thank you - all green now.

I'd say this is an acceptable workaround for the time being. The actual compat symbols used are prefixed with libressl_, so this will not conflict if someone should link in another C module using them.

It would be nice to root cause this at some point and see if upstream (or perhaps musl?) could fix this. I don't think we're doing anything out of the ordinary in this test and if Emscripten's versions break with that, someone's made a bad mistake (either us, Emscripten or musl).

Pending further requests or okays from @busterb and @joshuasing I think we can merge this soon.

.github/workflows/emscripten.yml Outdated Show resolved Hide resolved
@botovq
Copy link
Contributor

botovq commented Jun 19, 2024 via email

@botovq
Copy link
Contributor

botovq commented Jun 19, 2024

Somehow github didn't recognize the merge as it usually does. The code is now in the master branch.

@botovq botovq closed this Jun 19, 2024
@MoustaphaSaad
Copy link
Contributor Author

@MoustaphaSaad many thanks for your work, I have merged this now. I have kept the ChangeLog entry very minimal for now. Feel free to make a PR adding a couple of lines there, maybe adding a few hints to the README.md with one or two hints on how to use this.

Awesome, Thank you
Will surely add Emscripten entries to the README.md file

@MoustaphaSaad MoustaphaSaad deleted the fix-wasm-build branch June 20, 2024 18:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants