[Feature] SageMaker job as Studio kernel #15
Comments
Hi, Alex! It's definitely an interesting idea. I will do some research for the best route to take here. In the meantime, there's already an API that will help you to achieve the same results. You can run the following code in notebook cells:

```python
proxy = ssh_wrapper.start_ssm_connection(11022)
proxy.run_command_with_output("ps xfa")
proxy.disconnect()
```

Let me know if it helps and I will update the documentation accordingly.
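For context, a minimal sketch of where the `ssh_wrapper` object above would come from, following the pattern in the SSH Helper README; the `estimator` is an existing SageMaker Estimator and is assumed, not shown:

```python
# Sketch based on the SSH Helper README; `estimator` is assumed to be an
# already-configured SageMaker Estimator with the SSH Helper dependency added.
from sagemaker_ssh_helper.wrapper import SSHEstimatorWrapper

# Wrap the estimator before launching so the job starts the SSM agent.
ssh_wrapper = SSHEstimatorWrapper.create(estimator, connection_wait_time_seconds=600)
estimator.fit(wait=False)

# The SSM managed instance IDs (mi-...) of the running job's hosts,
# which are also what the magic commands discussed below would connect to.
print(ssh_wrapper.get_instance_ids())
```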
Thanks Ivan, this is certainly helpful but I guess I'm hoping it's possible to do a bit more...

First, AFAICT the current implementation of […]

Second, I guess it's more a question of the intended overall workflow for drafting training (/processing/etc.) script bundles in Studio JupyterLab and how this tool would fit in. I'm thinking of SSH Helper mainly as a workaround for the lack of local mode in SMStudio and some limitations of warm pools: I'm looking for a way to iterate quickly on the scripts in the JupyterLab UI and try running them in a training/processing job context, finding and fixing basic functional/syntax errors without having to wait for infrastructure spin-up. Features that seem helpful to me in this context include: […]

I raised this issue with (2) originally in mind, but thinking that magics could be used to provide (3) and (4) too. The main goal is to provide a super easy-to-use way (after the initial platform SSM/etc. setup is done, of course) for JupyterLab-minded scientists to iterate on their scripts until they functionally work, before quitting the interactive training/processing job and running a "proper" non-interactive one in a known, reproducible way. I appreciate that there are other use-cases for SSH Helper of course (like diagnosing processes/threads/etc. in an actually ongoing job) - I'm just wondering if it has potential to deliver a purpose-built, friction-free script debugging experience from Studio.
Original issue description

Lately I work mainly in SageMaker Studio, and I'd really like to be able to debug / interact with a running job using the same UI.
Solution idea
Create a custom Studio kernel image using an IPython extension and/or custom magic through which users can connect to a running SSH Helper job and run notebook cells on that instead of the Studio app.
The user experience would be something like using EMR clusters in Studio (a rough sketch of such an extension follows after this list):

- Fetch the managed instance ID of the running job, e.g. `mi-1234567890abcdef0`
- `%load_ext sagemaker_ssh_helper.notebook` to initialize the IPython extension
- `%sagemaker_ssh connect mi-1234567890abcdef0` to connect to the instance
- Cells then run on the remote instance unless the `%%local` cell magic is used - same as how the SageMaker Studio SparkMagic kernel works
- A `%sagemaker_ssh disconnect` command would also be useful

Since the `sagemaker_ssh_helper` library is pip-installable, it might even be possible to get this working with default (e.g. `Data Science 3.0`) kernels? I'm not sure - I assume it depends on how much hacking is possible during IPython extension load vs. what needs setting up in advance.
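To make the proposal concrete, here is a minimal sketch of how such an extension could register the proposed magics with IPython. This is not an implementation of this library: the module path and magic names are taken from the wish list above, and the actual SSM connection plumbing (plus the transparent dispatch of ordinary cells to the remote instance, which is the hard part) is stubbed out:

```python
# Sketch only: names follow the wish list above; connection logic is stubbed.
from IPython.core.magic import Magics, magics_class, line_magic, cell_magic


@magics_class
class SageMakerSSHMagics(Magics):
    def __init__(self, shell):
        super().__init__(shell)
        self.proxy = None  # active connection to the job instance, if any

    @line_magic
    def sagemaker_ssh(self, line):
        """Usage: %sagemaker_ssh connect mi-... / %sagemaker_ssh disconnect"""
        action, *args = line.split()
        if action == "connect":
            instance_id = args[0]
            self.proxy = ...  # e.g. open an SSM tunnel to instance_id here
        elif action == "disconnect":
            self.proxy = None  # e.g. tear the tunnel down here

    @cell_magic
    def local(self, line, cell):
        """%%local - run the cell in the Studio kernel, not on the remote job."""
        self.shell.run_cell(cell)


def load_ipython_extension(ipython):
    # Invoked by `%load_ext sagemaker_ssh_helper.notebook`.
    ipython.register_magics(SageMakerSSHMagics)
```

With something along these lines importable, the bullet-list workflow above would map one-to-one onto notebook cells; routing non-`%%local` cells to the remote instance would still need a hook into the shell's cell execution, which the stub omits.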
Why this route

To my knowledge, JupyterLab is a bit more fragmented in its support for remote kernels than IDEs like VSCode/PyCharm/etc. It seems like there are ways to set up SSH kernels, but it's also a tricky topic to navigate because so many pages online talk about "accessing your remotely-running Jupyter server" instead.

Investigating the Jupyter standard kernel spec paths, I see `/opt/conda/envs/studio/share/jupyter/kernels` exists but contains only a single `python3` kernel, which doesn't appear in the Studio UI. It looks like there's a custom `sagemaker_nb2kg` Python library that manages kernels, but no obvious integration points there for alternative kernel sources besides the Studio "Apps" system - and it's sufficiently internal/complex that patching it seems like a bad idea... So it looks like directly registering the remote instance as a kernel in JupyterLab would be a non-starter.
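For anyone repeating that investigation, the kernel specs Jupyter actually resolves can be listed from a notebook cell with `jupyter_client` (a small sketch; the exact paths will differ per Studio image):

```python
# List every kernel spec Jupyter can resolve in this environment, to check
# which spec directories Studio does (or does not) surface in its UI.
from jupyter_client.kernelspec import KernelSpecManager

for name, path in KernelSpecManager().find_kernel_specs().items():
    print(f"{name}: {path}")
```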
If the magic-based approach works, it might also be possible to use it with other existing kernel images (as mentioned above), and even inline in the same notebook after a training job is kicked off. Hopefully it would also enable toggling over to a new job/instance without having to run CLI commands to change the installed Jupyter kernels.