Scikit learn library not working with n_jobs for RandomForestClassifier? #69
Comments
Yes, of course, it is possible. Gramine fully supports multi-threaded workloads.
You seem to be doing it correctly in the manifest template. Both options that you showed are correct. The first option (option a) is insecure but allows fast checking. The second option (option b) is more secure but requires rebuilding the SGX app each time you want to change the number of threads. I'm not sure what you're doing wrong. Could you try option a and then show us the commands that you run and the outputs that you get? I would assume you do something like ... Also, did you try ...?
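For example, a minimal sanity check like the following (a sketch using only the standard library; run it under gramine-sgx the same way as the workload) shows whether OMP_NUM_THREADS and the visible CPU count look right inside the enclave:

```python
# Minimal sketch: print what the Python process actually sees inside the enclave.
import os

print("OMP_NUM_THREADS =", os.environ.get("OMP_NUM_THREADS"))  # value passed through the manifest
print("os.cpu_count()  =", os.cpu_count())                     # CPUs visible to the process
```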
Hi, thanks for the quick answer! :)
My results are as follows and, unlike native Python execution, don't improve inside SGX, for both gramine-sgx and gramine-direct. For my application I'm measuring only the training time of the Random Forest, along the lines of the sketch below:
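A minimal sketch of such a measurement (the dataset and parameters here are illustrative placeholders, not the actual workload; only the fit() call is timed):

```python
# Minimal sketch: time only the training (fit) of the Random Forest,
# so only work inside SGX is measured. The dataset is a placeholder.
import time
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier

X, y = load_digits(return_X_y=True)
clf = RandomForestClassifier(n_estimators=100, n_jobs=8)

start = time.perf_counter_ns()
clf.fit(X, y)
print("train time [ns]:", time.perf_counter_ns() - start)
```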
That means the measurement takes place only inside SGX. [Results table: Python, gramine-sgx, gramine-direct.] As you can see, there is no measurable performance change. Is there perhaps an error in my bash script? Thanks a lot!
Something is definitely broken in your setup. Could you please show us two things: ...
We need more info to debug this issue.
Hi, here is my manifest.template:
My experiment:
creates the following output:
I have attached my log file.
For my output the columns are:
Train and test times are measured in nanoseconds.
I don't see anything suspicious in the logs. Where do you run this workload? Is it a bare-metal machine or some VM?
I run it on a server with the following CPU: Intel(R) Xeon(R) Platinum 8352S CPU @ 2.20GHz. It is a bare-metal machine.
Can you set the number of threads programmatically in your script? Just to experiment, because I see no reason for ...
Sure, I just did it with ...
Results above:
Results now:
Still, only one core is used. Here is the log file as well.
No, sorry, that's not what I meant. By programmatically, I meant a literal code snippet in your Python script. In other words, setting it directly in the code itself.
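A minimal sketch of what that could look like (the value 8 is a placeholder; OMP_NUM_THREADS is set before the scikit-learn import so the OpenMP runtime picks it up):

```python
# Minimal sketch: set the thread count directly in the Python script.
# OMP_NUM_THREADS is set before importing scikit-learn so the OpenMP runtime
# sees it; n_jobs is passed explicitly as well. The value 8 is illustrative.
import os
os.environ["OMP_NUM_THREADS"] = "8"

from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier(n_estimators=100, n_jobs=8)
```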
I see 👍 I just did this and the output unfortunately remains the same. Is there anything else you can think of to check?
I find it weird. Unfortunately, at this point you'll need to dig very deep into what happens in ... I would also suggest starting with ...
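For instance, a minimal diagnostic sketch (assuming threadpoolctl and joblib, both of which scikit-learn depends on) that prints what the process actually sees inside the enclave:

```python
# Minimal sketch: print the CPU count Python sees, the worker count joblib
# would use for n_jobs=-1, and the native thread pools of the loaded libraries.
import os
import numpy  # loads the BLAS/OpenMP libraries so threadpool_info() can report them
import joblib
from threadpoolctl import threadpool_info

print("os.cpu_count()      :", os.cpu_count())
print("joblib (n_jobs=-1)  :", joblib.effective_n_jobs(-1))
print("OMP_NUM_THREADS     :", os.environ.get("OMP_NUM_THREADS"))
for pool in threadpool_info():
    print(pool.get("internal_api"), "num_threads =", pool.get("num_threads"))
```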
Hello,
I am trying to execute ML workloads from the scikit-learn library and want to see what performance benefit the n_jobs parameter of RandomForestClassifier gives inside the enclave. In native execution with Python, I get a performance improvement of roughly 50% when doubling the number of threads (n_jobs). However, for execution inside SGX with Gramine, I get the same performance for 1, 2, 4, 8, and 16 threads/jobs. Is it possible to use n_jobs inside SGX with Gramine?
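A minimal sketch of that kind of experiment (dataset size and estimator count are illustrative assumptions, not the actual workload):

```python
# Minimal sketch: time RandomForestClassifier training for several n_jobs
# values and compare. Dataset and sizes are illustrative placeholders.
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=20_000, n_features=40, random_state=0)

for n_jobs in (1, 2, 4, 8, 16):
    clf = RandomForestClassifier(n_estimators=200, n_jobs=n_jobs, random_state=0)
    start = time.perf_counter_ns()
    clf.fit(X, y)
    print(f"n_jobs={n_jobs}: {time.perf_counter_ns() - start} ns")
```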
For my manifest template I tried both options: passing OMP_NUM_THREADS through from the host environment and assigning it a value directly via loader.env.OMP_NUM_THREADS:
a)
loader.insecure__use_host_env = true
loader.insecure__use_cmdline_argv = true
loader.env.OMP_NUM_THREADS = { passthrough = true }
b)
loader.env.OMP_NUM_THREADS = "8"
Besides that, I have set sgx.max_threads = 128 and increased my enclave size to 64 GB. From scikit-learn I understand that n_jobs can also be set with OMP_NUM_THREADS. Am I doing something wrong in the manifest template, or is it just not possible to use n_jobs with Gramine?
Thanks in advance and best regards,
Robert Kubicek