
numpy fails while trying to run in python #267

Closed
tarikova opened this issue Oct 24, 2018 · 25 comments

@tarikova

tarikova commented Oct 24, 2018

I am working on a project that entails running the Python scientific computing stack inside Intel SGX enclaves. I have found Graphene very helpful for running Python programs. While I can run all the pure-Python libraries using the manifest that comes with /LibOS/shim/test/apps/python, when I try to run slightly more complex libraries, i.e., ones with compiled C library (.so) dependencies, the program fails.

I am trying to run this simple program:

import numpy as np

arr = np.linspace(0, 100)

print(arr)

I tried two different solutions:

  1. Changed the manifest file to mount the path to the Python libraries, and added the full library folder dist-packages (which includes all the Python libraries, so that dependencies are not an issue) to allowed_files.

  2. Because I realized that I might be missing some additional dependencies or .so library files, I tried a program called pyinstaller (https://www.pyinstaller.org/). It bundles a single Python program into an executable together with its .so dependencies, so that one can run it on any computer (with a similar OS). The nice thing is that it shows me all the .so files I need to run the program. However, after running this I get the same error as with method 1.

Both of these methods end with the following error:

Cannot attach to any TCS!
Memory Mapping Exception in Untrusted Code (RIP = 55ae916d52cd)

I tried reading the underlying thread_map code in Graphene (Pal/src/host/Linux-SGX/sgx_thread.c), but could not figure out a fix for this.

I would really appreciate it if you could help me find a solution or point me toward some approaches I might try. I think that if we can run the scientific computing stack (numpy, pandas, scikit-learn) inside Graphene, it would be of tremendous benefit to the scientific community and to anyone trying to run secure computation on untrusted servers or needing guarantees about their data's security.

Manifest:

#!$(PAL)

loader.preload = file:$(SHIMPATH)
loader.exec = file:/usr/bin/python
loader.execname = python
loader.env.LD_LIBRARY_PATH = /graphene:/graphene/resolv:/host:/usr/lib:/usr/lib/x86_64-linux-gnu
loader.env.PATH = /usr/bin:/bin
loader.env.USERNAME =
loader.env.HOME =
loader.env.PWD =
loader.debug_type = none

fs.mount.lib1.type = chroot
fs.mount.lib1.path = /graphene
fs.mount.lib1.uri = file:$(LIBCDIR)

fs.mount.lib2.type = chroot
fs.mount.lib2.path = /host
fs.mount.lib2.uri = file:/lib/x86_64-linux-gnu

fs.mount.bin.type = chroot
fs.mount.bin.path = /bin
fs.mount.bin.uri = file:/bin

fs.mount.usr.type = chroot
fs.mount.usr.path = /usr
fs.mount.usr.uri = file:/usr

fs.mount.etc.type = chroot
fs.mount.etc.path = /etc
fs.mount.etc.uri = file:/etc

fs.mount.home.type = chroot
fs.mount.home.path = /home
fs.mount.home.uri = file:/home

sys.stack.size = 1M
sys.brk.size = 4M
glibc.heap_size = 16M

sgx.trusted_files.ld = file:$(LIBCDIR)/ld-linux-x86-64.so.2
sgx.trusted_files.libc = file:$(LIBCDIR)/libc.so.6
sgx.trusted_files.libdl = file:$(LIBCDIR)/libdl.so.2
sgx.trusted_files.libm = file:$(LIBCDIR)/libm.so.6
sgx.trusted_files.libpthread = file:$(LIBCDIR)/libpthread.so.0
sgx.trusted_files.libutil = file:$(LIBCDIR)/libutil.so.1
sgx.trusted_files.libz = file:/lib/x86_64-linux-gnu/libz.so.1
sgx.trusted_files.libnss1 = file:/lib/x86_64-linux-gnu/libnss_compat.so.2
sgx.trusted_files.libnss2 = file:/lib/x86_64-linux-gnu/libnss_files.so.2
sgx.trusted_files.libnss3 = file:$(LIBCDIR)/libnss_dns.so.2
sgx.trusted_files.libssl = file:/lib/x86_64-linux-gnu/libssl.so.1.0.0
sgx.trusted_files.libcrypto = file:/lib/x86_64-linux-gnu/libcrypto.so.1.0.0
sgx.trusted_files.libresolv = file:$(LIBCDIR)/libresolv.so.2
sgx.trusted_files.hosts = file:hosts
sgx.trusted_files.resolv = file:resolv.conf
sgx.trusted_files.gai = file:gai.conf

sgx.allowed_files.pyhome = file:/usr/lib/python2.7
sgx.allowed_files.pyhome2 = file:scripts
sgx.allowed_files.pyhome3 = file:/home/$(USER)/.local/lib/python2.7/site-packages
@dimakuv

dimakuv commented Oct 24, 2018

Cannot attach to any TCS!
Memory Mapping Exception in Untrusted Code (RIP = 55ae916d52cd)

These are two issues, probably unrelated to the C libraries needed by Graphene-SGX.

  1. The first line (Cannot attach to any TCS!) implies that Graphene-SGX didn't allocate enough threads at startup. Recall that in the current SGX environment, all enclave threads must be pre-allocated.

By default, Graphene-SGX allocates four enclave threads. There is a special knob for this in the SGX manifest: sgx.thread_num. See the Wiki page. Try increasing the number of enclave threads to e.g. 8 in your manifest file. Also experiment with bigger enclave sizes:

...
sgx.thread_num = 8
sgx.enclave_size = 1024M

  2. The second issue (Memory Mapping Exception in Untrusted Code) may be resolved once you resolve the first one, but it can also be completely unrelated. If you still experience it after resolving the first issue, try to debug it (GDB=1 SGX=1 ...) and feel free to report your findings here.

I would really appreciate it if you could help me find a solution or point me toward some approaches I might try. I think that if we can run the scientific computing stack (numpy, pandas, scikit-learn) inside Graphene, it would be of tremendous benefit to the scientific community and to anyone trying to run secure computation on untrusted servers or needing guarantees about their data's security.

Agreed. I would also be interested in trying more Python workloads. Feel free to share your repo where you experiment with Python+Numpy so we can provide more direct feedback.

P.S. Graphene-SGX prints a clear warning/error message if it cannot find a shared library, so you will usually notice a missing library in Graphene-SGX's output. For debugging, it also helps to enable more verbose output with loader.debug_type = inline. And use strace -f ... when in doubt.

@thomasknauth
Contributor

Possibly, Python tries to create a new process under the hood to find the location of the shared library to load (see the code in cpython/Lib/ctypes/util.py). To get more insight, you may want to compile cpython from source with debug info and run it under gdb (Graphene has gdb support). That should bring you closer to understanding why it fails. Monitoring the system calls, as Dmitrii suggested, may also help.

I have no idea how pyinstaller works, but if it just packs all the dependencies into a single directory, it won't help in this scenario: the Python runtime probably still fork()s to discover the shared libraries to load.
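
To see that lookup path in action outside Graphene, here is a minimal sketch; on Linux, find_library() may shell out to ldconfig/gcc in a child process:

# Sketch: ctypes resolves a library name by spawning helper processes
# (see cpython/Lib/ctypes/util.py), which requires fork/exec support.
from ctypes.util import find_library

print(find_library("m"))  # e.g. 'libm.so.6'

If this call (or import numpy itself) dies inside Graphene, the hidden fork/exec is a likely suspect.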

@donporter
Contributor

I agree with these comments. You might try to figure out how many threads this application requires on Linux and increase the sgx.thread_num parameter until the TCS error goes away.
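
One rough way to measure this on stock Linux is to count the kernel tasks of the Python process while your workload runs; a quick sketch, not Graphene-specific:

# Sketch: list /proc/<pid>/task to count the threads the interpreter
# has created so far; leave headroom for Graphene's internal IPC/async
# threads when setting sgx.thread_num.
import os

print("threads in use:", len(os.listdir("/proc/%d/task" % os.getpid())))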

Let us know whether you are still having issues after that.

@tarikova
Author

Thanks so much everyone! I increased the resources as suggested by making the following changes:

sys.stack.size = 1M
sys.brk.size = 4M
glibc.heap_size = 16M

sgx.enclave_size = 1024M
sgx.thread_num = 16

In addition, I had to add some other .so library files to trusted files. Then it worked without any issues!

I was able to train a random forest using the sklearn, pandas, and numpy libraries on a cancer-cell image dataset :D

Given that we have Professor Porter on the thread: may I ask whether there is any known easy way to implement sealing and remote attestation (well, at least sealing) inside a Graphene enclave? I tried something very similar to this library: https://github.com/adombeck/python-sgx by running some C code inside Graphene and passing the result to Python. But no luck--the program fails with no output. I should probably start another issue to discuss that topic. Thanks again for the help.

@donporter
Contributor

@chiache ?

I do not believe we support sealing, but there is some support for remote attestation.

@thomasknauth
Contributor

Maybe https://github.com/cloud-security-research/sgx-ra-tls/blob/master/README.md could help? There are examples of how to use the RA-TLS library with Graphene. Doing it from within Python should be doable with minimal effort. In fact, the repo demonstrates how to do it from within Python, but for SGX-LKL instead of Graphene.

@tarikova
Author

Oh wow! This is very helpful @thomasknauth 👍 Thanks so much. I will take a look.

Sealing would be incredibly helpful, as it is one way to persist, say, keys or certificates on a server that we do not trust by design. I will keep trying the SWIG-based approach I am rooting for: compile a library file that can call the underlying sgx_seal_data function from sgx_tseal.h, and then call it from Python code running inside the enclave.
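
For illustration, this is the shape of what I have in mind, as a minimal ctypes sketch. The library name libseal.so and the seal_data() signature below are hypothetical placeholders for a self-built wrapper around sgx_seal_data, not part of any SDK:

# Hypothetical sketch: call a self-built wrapper around sgx_seal_data()
# from Python via ctypes. "libseal.so" and seal_data()'s signature are
# assumptions for illustration only.
import ctypes

lib = ctypes.CDLL("./libseal.so")  # hypothetical wrapper library
lib.seal_data.argtypes = [ctypes.c_char_p, ctypes.c_uint32,
                          ctypes.c_char_p, ctypes.c_uint32]
lib.seal_data.restype = ctypes.c_int

secret = b"key material to persist"
sealed = ctypes.create_string_buffer(4096)  # generously sized for the demo
print("seal_data returned", lib.seal_data(secret, len(secret), sealed, len(sealed)))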

I wonder if anyone has ever been successful at doing this?

I will keep everyone updated here and in issue #157.

Thanks again for the helpful pointers.

@tarikova
Author

tarikova commented Oct 25, 2018

One quick addition--nothing major, but I still want to report it. When I run a not-so-basic ML algorithm inside Graphene, I get the following warning message (and I get a ton of them):

file_map does not currently support writeable pass-through mappings on SGX.  You may add the PAL_PROT_WRITECOPY (MAP_PRIVATE) flag to your file mapping to keep the writes inside the enclave but they won't be reflected outside of the enclave.

I am wondering: what are pass-through mappings here? Can anyone enlighten me about those? Also, do they have any potential performance implications?

P.S. This is the piece of code I ran inside Graphene:

print(__doc__)
import sys
sys.path.append('/home/tarik/.local/lib/python2.7/site-packages')

from time import time
t = time()
import numpy as np

from sklearn import metrics
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.preprocessing import scale

np.random.seed(42)

digits = load_digits()
data = scale(digits.data)

n_samples, n_features = data.shape
n_digits = len(np.unique(digits.target))
labels = digits.target

sample_size = 300

print("n_digits: %d, \t n_samples %d, \t n_features %d"
      % (n_digits, n_samples, n_features))


print(82 * '_')
print('init\t\ttime\tinertia\thomo\tcompl\tv-meas\tARI\tAMI\tsilhouette')

N = 1

def bench_k_means(estimator, name, data):
    t0 = time()
    estimator.fit(data)
    print('%-9s\t%.2fs\t%i\t%.3f\t%.3f\t%.3f\t%.3f\t%.3f\t%.3f'
          % (name, (time() - t0), estimator.inertia_,
             metrics.homogeneity_score(labels, estimator.labels_),
             metrics.completeness_score(labels, estimator.labels_),
             metrics.v_measure_score(labels, estimator.labels_),
             metrics.adjusted_rand_score(labels, estimator.labels_),
             metrics.adjusted_mutual_info_score(labels,  estimator.labels_),
             metrics.silhouette_score(data, estimator.labels_,
                                      metric='euclidean',
                                      sample_size=sample_size)))

bench_k_means(KMeans(init='k-means++', n_clusters=n_digits, n_init=10, n_jobs=N),
              name="k-means++", data=data)

bench_k_means(KMeans(init='random', n_clusters=n_digits, n_init=10, n_jobs=N),
              name="random", data=data)

# in this case the seeding of the centers is deterministic, hence we run the
# kmeans algorithm only once with n_init=1
pca = PCA(n_components=n_digits).fit(data)
bench_k_means(KMeans(init=pca.components_, n_clusters=n_digits, n_init=1, n_jobs=N),
              name="PCA-based",
              data=data)
print(82 * '_')

# #############################################################################
# Visualize the results on PCA-reduced data

reduced_data = PCA(n_components=2).fit_transform(data)
kmeans = KMeans(init='k-means++', n_clusters=n_digits, n_init=10, n_jobs=1)
kmeans.fit(reduced_data)

# Step size of the mesh. Decrease to increase the quality of the VQ.
h = .02     # point in the mesh [x_min, x_max]x[y_min, y_max].

# Plot the decision boundary. For that, we will assign a color to each point in the mesh.
x_min, x_max = reduced_data[:, 0].min() - 1, reduced_data[:, 0].max() + 1
y_min, y_max = reduced_data[:, 1].min() - 1, reduced_data[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))

# Obtain labels for each point in mesh. Use last trained model.
Z = kmeans.predict(np.c_[xx.ravel(), yy.ravel()])

print "Took %.2f seconds" % (time() - t)

@donporter
Contributor

A pass-through mapping writes output to a file on the untrusted host. The issue is that Graphene neither encrypts this output nor integrity-checks it when you re-read the file's contents.

These are features that should either be added to Graphene or be part of a larger system that provides an encrypted, integrity-checked file system. Their absence doesn't create a performance problem, but it does create an attack vector on an adversarial host. At this point, our goal with that warning is to, well, warn the user about the state of things.
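
To illustrate the distinction in plain Python (a sketch that runs outside Graphene): a private, copy-on-write mapping keeps writes in memory, while a pass-through (shared) mapping writes them back to the host file.

# Sketch: ACCESS_COPY corresponds to MAP_PRIVATE -- writes stay inside
# the process and never reach the file, which is what the warning
# suggests. ACCESS_WRITE is a pass-through mapping: writes hit the
# (untrusted) host file.
import mmap

with open("data.bin", "wb") as f:
    f.write(b"\0" * mmap.PAGESIZE)

with open("data.bin", "r+b") as f:
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY)
    m[:5] = b"hello"  # visible in memory only
    m.close()

with open("data.bin", "rb") as f:
    print(f.read(5))  # still five zero bytes on disk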

@tarikova
Author

Thanks a lot for the clarification. Good to know the security risks.

@Vampsj

Vampsj commented Dec 26, 2018

Hi @donporter, sorry for having so many problems. When I try to run Python with the numpy library, I get the following error message:

ImportError: /graphene/libm.so.6: version `GLIBC_2.23' not found (required by /usr/lib/x86_64-linux-gnu/libquadmath.so.0)

I have checked issue #179 and tried adding asm(".symver realpath,realpath@GLIBC_2.19"); as suggested, but it still doesn't work, giving the same error message as before.

I would really appreciate it if someone could help me with how to recompile/run the file against glibc 2.19. Thank you.

@donporter
Contributor

I think you will either need a patched glibc 2.23, or a version of libm compiled against glibc 2.19. I might consider just downgrading to an older version of Ubuntu in the interest of getting things working. I'm guessing you are on 18.04?
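
As a quick sanity check, you can print which glibc your process actually loads; gnu_get_libc_version() is a standard glibc call:

# Sketch: report the glibc version the interpreter is running against,
# to compare the host libc with the one Graphene mounts at /graphene.
import ctypes

libc = ctypes.CDLL("libc.so.6")
libc.gnu_get_libc_version.restype = ctypes.c_char_p
print(libc.gnu_get_libc_version())  # e.g. 2.23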

@Vampsj

Vampsj commented Dec 26, 2018

Thanks for your reply! I will look into both options. I am running 16.04, but its glibc is 2.23...

@thomasknauth
Contributor

thomasknauth commented Jan 2, 2019

We are running some numpy code with Graphene-SGX without the problems you mentioned. It may help to use virtualenv to install numpy and use that instead of the package version. Thinking about it, that was only one of the problems we ran into. I don't think we are ready to share our modified version of Graphene just yet.

@Vampsj

Vampsj commented Jan 5, 2019

Thank you, @thomasknauth. I will try again with virtualenv.

@Khallu

Khallu commented Dec 19, 2019

Hey @tarikova,
I'm trying to run a few ML libraries inside Graphene too.
I would highly appreciate it if you could share the manifest and the rest of the setup for this script: #267 (comment)!

I keep getting the following every time I try to import an ML package (scikit, tf, keras) whose dependencies I listed in the manifest:

Cannot open manifest file: python3.6.manifest.sgx
USAGE: /home/khallu/graphene/Pal/src/../../Runtime/pal-Linux-SGX [executable|manifest] args ...

and when I remove the import statement, everything else runs fine.

@Khallu

Khallu commented Dec 19, 2019

@dimakuv yes, I ran them too. But I'm unable to use the Python tf or keras modules. I get the above-mentioned error even though I have mapped the necessary shared objects in the manifest.
I also tried modifying the same example to run tf/keras/scikit, but I get the same error.

@dimakuv

dimakuv commented Dec 19, 2019

@Khallu, could you explain a bit more about your issue? Do you have the file python3.6.manifest.sgx in your directory? What is the import statement you mention?

@Khallu

Khallu commented Dec 20, 2019

@dimakuv I'm using the same manifest. I'm adding the dependencies required for tensorflow to PY_LIBS in the Makefile, like this:

$(PYTHONSITEHOME)/numpy/core/_multiarray_umath.cpython-$(PYTHONSHORTVERSION)m-x86_64-linux-gnu.so \
$(PYTHONSITEHOME)/scipy/sparse/_sparsetools.cpython-$(PYTHONSHORTVERSION)m-x86_64-linux-gnu.so \
$(PYTHONSITEHOME)/tensorflow/python/_pywrap_tensorflow_internal.so \
$(PYTHONSITEHOME)/h5py/h5.cpython-$(PYTHONSHORTVERSION)m-x86_64-linux-gnu.so

But when I run the Python scripts (with import tensorflow in them) using Graphene, I get:

Cannot open manifest file: python3.6.manifest.sgx
USAGE: /home/khallu/graphene/Pal/src/../../Runtime/pal-Linux-SGX [executable|manifest] args ...

But when I remove the import statement, it runs fine.

The name of the manifest is python.manifest.sgx in both cases, but trying to import tf makes it look for python3.6 as the executable.

@dimakuv

dimakuv commented Dec 20, 2019

Ah, that's interesting. It looks like import tensorflow wants to spawn a new process which is not just python (a symlink) but the actual executable name python3.6. I suggest you rename your python.manifest.template to python3.6.manifest.template and change python everywhere inside your Makefile/manifest to python3.6.
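
A quick way to confirm the name under which the child interpreter is spawned (a small sketch):

# Sketch: libraries that re-spawn Python use the real interpreter path;
# if this prints .../python3.6, that explains why Graphene looks for
# python3.6.manifest.sgx for the child process.
import sys

print(sys.executable)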

@Khallu

Khallu commented Dec 21, 2019

Thanks, that seems to have fixed that specific issue. But now, while executing, the program causes my machine to log out and reboot automatically. I'm not sure why. Could it be a memory issue?

The following are the general options in the manifest:

# Graphene general options

# Graphene creates stacks of 256KB by default. It is not enough for SciPy/NumPy
# packages, e.g., libopenblas dependency assumes more than 512KB-sized stacks.
sys.stack.size = 2M

# SGX general options

# Set the virtual memory size of the SGX enclave. For SGX v1, the enclave
# size must be specified during signing. If Python needs more virtual memory
# than the enclave size, Graphene will not be able to allocate it.
sgx.enclave_size = 4G

# Set the maximum number of enclave threads. For SGX v1, the number of enclave
# TCSes must be specified during signing, so the application cannot use more
# threads than the number of TCSes. Note that Graphene also creates an internal
# thread for handling inter-process communication (IPC), and potentially another
# thread for asynchronous events. Therefore, the actual number of threads that
# the application can create is (sgx.thread_num - 2).
sgx.thread_num = 64

@Khallu

Khallu commented Dec 21, 2019

It works with 32 as the thread number. But now I get this (after adding tupletable and cputable as trusted files):

Unknown or illegal instruction at RIP 0x00000000ec829cec
Internal illegal fault at 0xec829cec (IP = 0xec829cec, VMID = 2284740226, TID = 1)
(the same two lines repeat many times)

@dimakuv

dimakuv commented Dec 30, 2019

Now this looks like a bug in Graphene :)

Could you create a minimal test case? And attach the required Makefile + manifest.template to your comment? At this point, we need to debug and understand the root cause.

@Khallu

Khallu commented Jan 3, 2020

Okay, I'll try to reproduce the issue and post the test case.
