Our team has a cloud-based JupyterHub which is open for use by all team members.
| Hub Address | https://m2lines.2i2c.cloud/ |
| Hub Location | Google Cloud us-central1 |
| Hub Operator | 2i2c |
| Hub Configuration | https://github.com/2i2c-org/infrastructure/tree/master/config/clusters/m2lines |
For questions about how to use the Hub, please open an issue in this repo:
Ryan will respond to your issue and decide whether to refer it to 2i2c for technical support.
This is a rough and ready guide to using the Hub. This documentation will be expanded as we learn and evolve. Feel free to edit it yourself if you have suggestions for improvement!
- 👀 Navigate to https://m2lines.2i2c.cloud/ and click the big orange button that says "Log in to continue".
- 🔐 You will be prompted to authorize a GitHub application. Say "yes" to everything. Note that you must belong to the m2lines GitHub organization in order to access the hub. (Johanna can add you.)
- 📠 You will be redirected to a screen with the following options. Choose which machine type you want to use.
  You should try to use the smallest image you need. The GPU images should be used only when needed to accelerate model training.
- 🕥 Wait for your server to start up. It's normal for this to take a few minutes.
After your server fires up, you will be dropped into a JupyterLab environment.
If you are new to JupyterLab, you might want to peruse the user guide.
Your server will shut down automatically after a period of inactivity. However, if you know you are done working, it's best to shut it down directly. To shut it down, go to https://m2lines.2i2c.cloud/hub/home and click the big red button that says "Stop My Server".
You can also navigate to this page from JupyterLab by clicking the File menu and going to Hub Control Panel.
The Hub environment contains a full-featured, up-to-date Python environment. The environments are maintained by Pangeo. You can read all about them at the following URL:
These are Docker images which you can also run on your own computer.
The Hub contains a specific version of the image which can be found here.
For example, at the time of writing, the version of pangeo-notebook is 2022.05.10.
A complete list of all packages installed in this environment is located at:
There are separate images for pytorch and tensorflow.
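To check which versions are actually installed in your running server, you can also query the environment from a notebook. The following is a minimal sketch using only the Python standard library; the helper name `installed_versions` and the package names in the usage comment are illustrative assumptions, not part of the Hub configuration.

```python
from importlib import metadata


def installed_versions(distribution_names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distribution_names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


# Example call (package names are illustrative):
# installed_versions(["xarray", "dask", "zarr"])
```

A `None` entry means the package is not in the current image and would need to be installed for the session.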
You can install additional packages using pip and conda. However, these will disappear when your server shuts down.
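Because extra packages do not survive a server restart, one option is to put a small install step at the top of a notebook so that it reinstalls whatever is missing each session. The sketch below is one way to do that, not an official Hub mechanism. The helper names are hypothetical, and it assumes each package's import name matches its pip name and that only top-level module names are passed.

```python
import importlib.util
import subprocess
import sys


def missing_packages(import_names):
    """Return the top-level module names that cannot currently be imported."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


def ensure_installed(import_names):
    """Install any missing packages with pip and return the list that was installed."""
    missing = missing_packages(import_names)
    if missing:
        # Use the running interpreter's pip so packages land in the active environment.
        subprocess.check_call([sys.executable, "-m", "pip", "install", *missing])
    return missing


# Example call (hypothetical package name):
# ensure_installed(["some_extra_package"])
```

When everything is already present, `ensure_installed` returns an empty list and does not invoke pip at all.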
Data and files work differently in the cloud. To help onboard you to this new way of working, we have written a guide to Files and Data in the Cloud:
We recommend you read this thoroughly, especially the part about Git and GitHub.
To help you scale up calculations using a cluster, the Hub is configured with Dask Gateway. For a quick guide on how to start a Dask Cluster, consult this page from the Pangeo docs:
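As a rough illustration of what starting a cluster looks like, here is a minimal sketch using the `dask_gateway` client library. It assumes the Hub's environment already provides the gateway address and credentials, so that `Gateway()` works with no arguments. The function name and the default of 4 workers are illustrative choices, not Hub recommendations.

```python
def start_dask_cluster(n_workers=4):
    """Start a Dask Gateway cluster and return (cluster, client)."""
    # Imported here so the function can be defined even where dask_gateway is absent.
    from dask_gateway import Gateway

    gateway = Gateway()              # assumes address and auth come from the Hub's config
    cluster = gateway.new_cluster()  # request a new cluster from the gateway
    cluster.scale(n_workers)         # ask for a fixed number of workers
    client = cluster.get_client()    # connect a Dask client to the cluster
    return cluster, client


# Example usage on the Hub:
# cluster, client = start_dask_cluster(n_workers=4)
# ... run your Dask computations ...
# cluster.shutdown()  # release the workers when you are done
```

Shutting the cluster down explicitly when you finish follows the same principle as stopping your server: it frees resources you are no longer using.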
[More info to be added soon]