numpy fails while trying to run in python #267
Comments
These are two issues (probably) unrelated to C libraries needed by Graphene-SGX.
By default, Graphene-SGX allocates four enclave threads. There is a special knob in the SGX manifest:
Agreed. I would also be interested in trying more Python workloads. Feel free to share your repo where you experiment with Python+Numpy so we can provide more direct feedback. P.S. Graphene-SGX has a clear warning/error message if it cannot find a shared library. So usually you will notice the missing library in the Graphene-SGX's output. For debugging purposes, it is also helpful to enable more verbose output |
Possibly, Python tries to create a new process under the hood to find the location of the shared library to load (code in cpython/Lib/ctypes/util.py). To get more insights, you may want to compile cpython from source with debug info and run it under gdb (Graphene has gdb support). This should hopefully bring you closer to the question why it fails. Monitoring the system calls as Dmitrii suggested may also bring you closer to the problem. I have no idea how pyinstaller works, but if it is just packing all the dependencies into a single directory it won't be of help in this scenario. The python runtime probably still fork()s to discover the shared libraries to load. |
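One way to check the guess above without a debugger is to record which external commands `ctypes.util.find_library` launches on ordinary Linux. This is only a sketch: `record_spawned_commands` is a hypothetical helper name, it temporarily swaps `subprocess.Popen` for a recording subclass, and it assumes the library lookup goes through `subprocess` (older Python versions may use other mechanisms).

```python
import ctypes.util
import subprocess


def record_spawned_commands(func, *args):
    """Call func(*args); return (result, commands handed to subprocess.Popen).

    Hypothetical helper for illustration only. It patches subprocess.Popen
    for the duration of the call and restores it afterwards.
    """
    spawned = []
    original_popen = subprocess.Popen

    class RecordingPopen(original_popen):
        def __init__(self, cmd, *popen_args, **popen_kwargs):
            # Record the command before trying to start it, so commands
            # that fail to launch are still listed.
            spawned.append(cmd)
            super().__init__(cmd, *popen_args, **popen_kwargs)

    subprocess.Popen = RecordingPopen
    try:
        result = func(*args)
    finally:
        subprocess.Popen = original_popen
    return result, spawned


if __name__ == "__main__":
    soname, commands = record_spawned_commands(ctypes.util.find_library, "m")
    print("resolved:", soname)
    for cmd in commands:
        print("spawned:", cmd)
```

If the list is non-empty on plain Linux, the same lookup needs process creation inside the enclave as well. Loading a library by its full path with `ctypes.CDLL` avoids the search, provided the calling package lets you control that.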
I agree with these comments. You might try to figure out how many threads this application requires on Linux and increase the thread_num parameter until the TCS error goes away. Let us know whether you are still having issues after that. |
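To estimate how many threads the application uses on plain Linux, the kernel's per-process thread count can be read from `/proc`. A minimal sketch, assuming a Linux host with `/proc` mounted; `os_thread_count` is a hypothetical helper name, and the number it reports is only a starting point for thread_num, since the count needed inside the enclave may be higher.

```python
from pathlib import Path


def os_thread_count(pid="self"):
    """Return the number of OS threads of a process, or None if unknown.

    Reads the "Threads:" field of /proc/<pid>/status (Linux only).
    """
    try:
        status = Path(f"/proc/{pid}/status").read_text()
    except OSError:
        return None
    for line in status.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    return None


if __name__ == "__main__":
    print("threads before import:", os_thread_count())
    try:
        import numpy  # noqa: F401  (some BLAS builds start worker threads on import)
    except ImportError:
        pass
    print("threads after import:", os_thread_count())
```

Running it once before and once after the heavy imports shows whether the libraries themselves start worker threads.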
Thanks so much everyone! I increased the resources as suggested by making the following changes:
In addition, I had to add some other .so library files to trusted files. Then it worked without any issues! I was able to train a random forest using the sklearn, pandas, and numpy libraries on a cancer-cell image dataset :D Given we have Professor Porter on the thread: may I ask whether there is any known easy way to implement sealing and remote attestation (well, at least sealing) inside a Graphene enclave? I tried something very similar to this library: https://github.com/adombeck/python-sgx by running some C code inside Graphene and passing it to Python. But no luck: the program fails with no output. I probably should start another issue to discuss that topic. Thanks again for the help. |
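For anyone repeating the trusted-files step, one way to find candidate .so files is to run the script on plain Linux and list the shared objects mapped into the process after the imports. This is a sketch under assumptions: it relies on `/proc/self/maps` (Linux only), the helper names are hypothetical, and it only sees libraries loaded up to the point where it is called.

```python
import importlib
import re
from pathlib import Path

# Matches paths such as libm.so.6 or _multiarray_umath.cpython-36m-x86_64-linux-gnu.so
_SO_PATTERN = re.compile(r"\.so(\.\d+)*$")


def loaded_shared_objects(pid="self"):
    """Return the sorted, unique .so paths mapped into a process ([] if unknown)."""
    try:
        maps = Path(f"/proc/{pid}/maps").read_text()
    except OSError:
        return []
    paths = set()
    for line in maps.splitlines():
        fields = line.split(None, 5)  # the sixth field, if present, is the path
        if len(fields) == 6 and _SO_PATTERN.search(fields[5].strip()):
            paths.add(fields[5].strip())
    return sorted(paths)


def new_shared_objects_after_import(module_name):
    """Return the .so files that appear only after importing module_name."""
    before = set(loaded_shared_objects())
    importlib.import_module(module_name)
    return sorted(set(loaded_shared_objects()) - before)


if __name__ == "__main__":
    for path in new_shared_objects_after_import("sqlite3"):
        print(path)
```

Replacing `"sqlite3"` with `"numpy"` or `"sklearn"` gives the list for those packages, which can then be compared with what the manifest already covers.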
@chiache ? I do not believe we support sealing, but there is some support for remote attestation. |
https://github.com/cloud-security-research/sgx-ra-tls/blob/master/README.md maybe this could help? There are examples how to use the RA-TLS library with Graphene. Doing it from within Python should be doable with minimal effort. In fact, the repo demonstrates how to do it from within Python but for SGX-LKL instead of Graphene. |
Oh wow! This is very helpful @thomasknauth 👍 Thanks so much. I will take a look. Sealing would be incredibly helpful, as that's one way to persist, say, some sort of key or certificate on a server that we do not trust by design. I will keep trying the swig-based solution that I am rooting for, i.e. compile a library file that can call the underlying I wonder if anyone has ever been successful doing this? I will keep everyone updated here and in issue #157. Thanks again for the helpful pointers. |
One quick issue to add: nothing major, but I still want to report it. When I run some not-very-basic ML algorithm inside Graphene, I get the following warning messages (and I get a ton of them):
I am wondering what the pass-through mappings are here. Can anyone enlighten me about those? Also, do they cause any performance issues? P.S. This is the piece of code I ran inside Graphene:
|
A pass-through mapping writes output to a file on the untrusted host. The issue is that Graphene neither encrypts this output, nor integrity checks it if you re-read the contents of the file. These are features that should either be added to Graphene, or be part of a larger system that provides an encrypted and integrity-checked file system. Their absence doesn't create a performance problem, but does create an attack vector on an adversarial host. At this point, our goal in that warning is to, well, warn the user about the state of things. |
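Until such a feature exists, an application can at least detect tampering of its own output by storing a keyed MAC next to the data. The sketch below is an application-level workaround, not a Graphene feature. It assumes the key is available only inside the enclave (how it gets there is out of scope here), it provides no confidentiality, and it does not detect an old file being replayed.

```python
import hashlib
import hmac

TAG_SIZE = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256


def write_with_tag(path, data, key):
    """Write HMAC-SHA256(key, data) followed by data to path."""
    tag = hmac.new(key, data, hashlib.sha256).digest()
    with open(path, "wb") as f:
        f.write(tag + data)


def read_and_verify(path, key):
    """Read a file written by write_with_tag; raise ValueError if it was modified."""
    with open(path, "rb") as f:
        blob = f.read()
    tag, data = blob[:TAG_SIZE], blob[TAG_SIZE:]
    expected = hmac.new(key, data, hashlib.sha256).digest()
    # compare_digest avoids leaking the position of the first mismatch
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed for " + str(path))
    return data
```

A caller would use `write_with_tag("model.bin", payload, key)` when saving and `read_and_verify("model.bin", key)` when loading; the file name here is only an example.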
Thanks a lot for the clarification. Good to know the security risks. |
Hi, @donporter sorry for having so many problems. When I want to run the python with numpy library, I
I have checked issue #179 and tried adding asm(".symver realpath,realpath@GLIBC_2.19"); to the py file, but it still doesn't work and gives the same error message as before. I would really appreciate it if someone could help me with how to recompile or run the file against glibc-2.19. Thank you. |
I think you will either need a patched glibc 2.23, or to get a version of libm that is compiled against glibc 2.19. I might consider just downgrading to an older version of Ubuntu in the interest of getting things working. I'm guessing you are on 18.04? |
Thanks for your reply! I will check about both ways. I am running on 16.04 but the glibc is just 2.23... |
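To confirm which glibc the Python interpreter itself sees on the host, the version can be queried through `os.confstr`. A small sketch, assuming a glibc-based Linux system; `glibc_version` is a hypothetical helper name, and it returns None where the query is not supported.

```python
import os


def glibc_version():
    """Return the host glibc version as a tuple of ints, or None if unavailable."""
    try:
        value = os.confstr("CS_GNU_LIBC_VERSION")  # e.g. "glibc 2.23"
    except (AttributeError, ValueError, OSError):
        return None
    if not value:
        return None
    name, _, version = value.partition(" ")
    if name != "glibc":
        return None
    return tuple(int(part) for part in version.split(".") if part.isdigit())


if __name__ == "__main__":
    print("host glibc:", glibc_version())
```

Note that this reports the host's glibc, which is not necessarily the one Graphene loads inside the enclave.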
We are running some numpy code with Graphene-SGX without the problems you mentioned. It may help to use virtualenv to install numpy and use that instead of the package version. Thinking about it, that was only one of the problems we ran into. I don't think we are ready to share our modified version of Graphene just yet. |
Thank you @thomasknauth . I will try again with virtualenv. |
Hey @tarikova I keep getting the following every time I try to import an ML package (scikit, tf, keras) whose deps I matched in the manifest.
and when I remove the import statement everything else runs fine. |
@Khallu Did you look into our Python examples: |
@dimakuv yes, I ran them too. But I'm unable to use the Python tf or keras modules. I get the above-mentioned error even though I have mapped the shared objects that are necessary to run in the manifest. |
@Khallu , could you explain a bit more about your issue? Do you have the file |
@dimakuv I'm using the same manifest. I'm adding the dependencies that I'd require for tensorflow and I add them to the makefile, like this to PY_LIBS
But when I run the Python scripts (with the import tensorflow in them) using Graphene, I get The name of the manifest in both cases is python.manifest.sgx. But trying to import tf makes it look for python3.6 as the exec |
Ah, that's interesting. It looks like |
Thanks, that worked for that specific issue I suppose. But now, the program while executing leads my machine to log-out and reboot automatically. I'm unsure why. Could it be due to a memory issue? The following are the general options in the manifest:
|
It works with 32 as the thread no. But now I get this (after I added the tupletable and cputable as trusted files)
|
Now this looks like a bug in Graphene :) Could you create a minimal test case? And attach the required Makefile + manifest.template to your comment? At this point, we need to debug and understand the root cause. |
Okay, I'll try to reproduce the issue and post the test case. |
I am working on a project that entails running the Python scientific computing stack inside Intel SGX enclaves. I found Graphene very helpful for running Python programs. While I can run all the native libraries (written fully in Python) using the manifest that comes with /LibOS/shim/test/apps/python, the program fails when I try to run slightly more complex libraries, i.e. ones that depend on underlying compiled C libraries (.so files). I am trying to run this simple program:
I tried two different solutions:
Changed the manifest file to mount the path to the Python libraries and added the full library folder dist_packages to allowed_files (it includes all the Python libraries, so that dependencies are not an issue).
Because I realized that there might be some additional dependencies or .so library files that I was missing, I tried a program called pyinstaller (https://www.pyinstaller.org/). It packages a single Python program into an executable together with its .so dependencies, so that one can run it on any computer (with a similar OS). The nice thing is that it shows me all the .so files I need to run the program. However, after running this I get the same error as with method 1.
Both of these methods end with the following error:
I tried reading the underlying thread_map code in graphene (Pal/src/host/Linux-SGX/sgx_thread.c), but could not figure out a fix for this.
I would really appreciate it if you could help me find a solution or point me towards some approaches that I might try. I think if we can run the scientific computing stack (numpy, pandas, scikit, sklearn) inside Graphene, it would be of tremendous benefit to the scientific community and to any parties who are trying to run secure computation on untrusted servers or who need guarantees about their data security.
Manifest: