How to SSH into computers in Glen's lab

simondemeule edited this page Apr 3, 2023 · 12 revisions

Introduction

Glen's lab has a few machines we can use for work. They are specced to handle the machine learning tasks you are likely to work on, and are mainly meant for prototyping and development. Unlike Compute Canada and Mila, there are very few security restrictions that prevent you from installing the software packages you need or building virtual environments. However, the lab machines provide a limited amount of compute shared between us, so they can't support very heavy experiments; the bigger clusters are better suited for this.

The easiest way to use these machines is likely through VSCode's SSH extension. This allows you to use a remote machine as if you were running VSCode directly on it — the file explorer, debugger, and all usual features work as expected.

If you need to run something overnight, be sure to use a terminal multiplexer (e.g. tmux or screen) so that your program keeps running if your SSH connection cuts out.
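As a rough sketch of that workflow with tmux, the commands below start a detached session, list the running sessions, and show how to reattach later. The session name overnight and the sleep command are placeholders chosen for illustration, not lab conventions; substitute your own job.

```shell
#!/usr/bin/env bash
# Hypothetical names: replace with your own session name and command.
SESSION="overnight"
JOB="sleep 10"   # stand-in for your actual job, e.g. a training script

if command -v tmux >/dev/null 2>&1; then
    # Start a detached session running the job; it survives SSH disconnects.
    tmux new-session -d -s "$SESSION" "$JOB"
    # List sessions to confirm it is running.
    tmux list-sessions
else
    echo "tmux is not installed on this machine" >&2
fi

# Later, after reconnecting over SSH, reattach with:
#   tmux attach-session -t overnight
# Detach again without stopping the job by pressing Ctrl-b, then d.
```

A session started this way closes as soon as its command exits. If you want to inspect the output afterwards, start tmux interactively and launch the job from the shell inside it instead.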

Since we want the machines to be remotely available at all times, please never shut them down. We are looking into a solution that would allow us to power them on remotely, but for now, someone must be physically present to power them back on.

Since other users may be logged into the machine you are connected to, be sure to run users and nvidia-smi to see whether programs are already running on the machine you picked. This helps distribute the workload evenly across the machines and avoids crashing other members' software by running out of memory.
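One possible way to bundle both checks is a small helper like the one below. The function name check_machine is made up for this example, and the query fields are just one reasonable selection; plain nvidia-smi works as well and also lists the processes using each GPU.

```shell
#!/usr/bin/env bash
# Print who is logged in and, if available, what the GPUs are doing.
check_machine() {
    echo "Logged-in users:"
    users
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Compact per-GPU summary of memory use and utilization.
        nvidia-smi --query-gpu=index,name,memory.used,memory.total,utilization.gpu --format=csv \
            || echo "nvidia-smi failed to query the GPUs" >&2
    else
        echo "nvidia-smi not found; is this a GPU machine?" >&2
    fi
    return 0
}

check_machine
```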

Currently, the computers are hooked up to a network managed by DIRO. At the moment, we are unable to port forward to them directly or set up our own login node. This might change in the future. To access the machines, we must first SSH into DIRO's login node, and then SSH into our machines from there.


Set up the login node

Before you can access the lab computers, you must first be able to access the login node.


0.0 : Create your login node account

Follow the instructions here to get a username and password for the arcade.iro.umontreal.ca login node. You need to have set up your UNIP-based account at the University of Montreal for this to work.


0.1 : Connect to the login node

Check that the credentials are working by logging into the node. On your local machine, add the following alias to your SSH configuration. This file is found at ~/.ssh/config; create it if it doesn't already exist.

Host arcade
   HostName arcade.iro.umontreal.ca
   User $LOGIN_NODE_USERNAME
   PreferredAuthentications password

This sets up an alias to the login node that allows you to connect easily by running

ssh arcade

You will be prompted for the password given to you when you created your account in the previous step.
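On a fresh machine the SSH configuration file may not exist yet. A minimal way to create it, assuming a POSIX-style shell on your local machine, is sketched below. The permission values are the conventional ones: OpenSSH rejects a config file that other users can write to.

```shell
#!/usr/bin/env bash
# Create the SSH directory and config file if they are missing.
mkdir -p "$HOME/.ssh"
touch "$HOME/.ssh/config"
# Keep both private to your account.
chmod 700 "$HOME/.ssh"
chmod 600 "$HOME/.ssh/config"
```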


0.2 : Change your login node password

Change your password on the login node by running

passwd

Set up the lab machines

Once the login node is set up, you can start setting up the lab machines.

For your own convenience and for security reasons, we ask that you set up public key authentication. Password authentication should only be enabled momentarily to install your encryption key.


1.0 : Get an account on the machines

If you don't already have an account on the machine you want to connect to, you'll have to create one. You can reach out to Simon or someone else who you know already has an account. This is easy to do if both of you are at the lab. It can also be done remotely.


1.0+ : Creating an account for a new lab member remotely

If you already have an account on a lab machine and want to create one for a new lab member, you can do so remotely by first running

sudo adduser $NEW_USERNAME

to create the account, and then

sudo usermod -aG sudo $NEW_USERNAME

to promote it to administrator. For security, have the new lab member change the password as soon as they get access to the machine.


1.1 : Get a public-private key pair

You will need to use an encryption key pair to access the lab machines. If you already have one you wish to use with this, you can skip this step. Otherwise, run

ssh-keygen

and respond to the interactive prompt to create a key pair.
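The prompts can also be answered up front on the command line. The sketch below generates an Ed25519 key pair in a throwaway directory so that it can be run safely; the file name id_ed25519_lab and the comment lab-machines are invented for illustration. It passes an empty passphrase (-N "") only to stay non-interactive. For a real key, drop -N so that ssh-keygen asks you for a passphrase, and write the key under ~/.ssh instead.

```shell
#!/usr/bin/env bash
# Demonstration only: write the key pair to a temporary directory.
KEY_DIR="$(mktemp -d)"
KEY_FILE="$KEY_DIR/id_ed25519_lab"   # hypothetical file name

if command -v ssh-keygen >/dev/null 2>&1; then
    # -t: key type, -f: output file, -C: comment, -N: passphrase (empty here).
    ssh-keygen -q -t ed25519 -f "$KEY_FILE" -C "lab-machines" -N ""
    # Print the fingerprint of the public half.
    ssh-keygen -l -f "$KEY_FILE.pub"
else
    echo "ssh-keygen is not installed" >&2
fi
```

The private key is the file without an extension, and the public key, which is the one that gets copied to the lab machine, ends in .pub.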


1.2 : Install the encryption key on the lab machine

You will need to copy your public key to the lab machine for the login to work. This requires password authentication to be allowed on the lab machine; we keep it disabled except when new users are copying their keys. Reach out to Simon or another member who is familiar with the machines to set this up for you.

Before you do this, set up a more convenient way to log into the lab machines remotely by adding these aliases to your SSH config (shown here for the blue, green, red and rainbow lab machines):

Host blue
   ProxyJump arcade
   HostName 172.19.8.33
   User $LAB_MACHINE_USERNAME
   PreferredAuthentications password

Host green
   ProxyJump arcade
   HostName 172.19.8.32
   User $LAB_MACHINE_USERNAME
   PreferredAuthentications password

Host red
   ProxyJump arcade
   HostName 172.19.8.54
   User $LAB_MACHINE_USERNAME
   PreferredAuthentications password

Host rainbow
   ProxyJump arcade
   HostName 172.19.8.55
   User $LAB_MACHINE_USERNAME
   PreferredAuthentications password

Note the use of the ProxyJump keyword: this is telling ssh to connect to the lab machines by first connecting through the login node we defined an alias for earlier. This means you will be asked for your login node password first when you use these new aliases.
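To see how ssh resolves one of these aliases without opening a connection, you can ask it to print the effective options. The sketch below uses a throwaway config file with made-up usernames (demo_login_user and demo_lab_user are placeholders) so that your real ~/.ssh/config is left untouched. It assumes a reasonably recent OpenSSH client, since both the -G flag and ProxyJump are absent from old versions.

```shell
#!/usr/bin/env bash
# Write a throwaway config so the real ~/.ssh/config is left untouched.
DEMO_CONFIG="$(mktemp)"
cat > "$DEMO_CONFIG" <<'EOF'
Host arcade
   HostName arcade.iro.umontreal.ca
   User demo_login_user

Host blue
   ProxyJump arcade
   HostName 172.19.8.33
   User demo_lab_user
EOF

RESOLVED=""
if command -v ssh >/dev/null 2>&1; then
    # -F selects the config file; -G prints the resolved options and exits
    # without connecting to anything.
    RESOLVED="$(ssh -F "$DEMO_CONFIG" -G blue 2>/dev/null)"
    # Show only the options relevant to the jump setup.
    echo "$RESOLVED" | grep -E '^(user|hostname|proxyjump) '
fi
rm -f "$DEMO_CONFIG"
```

With the aliases in your actual config, ssh -G blue is enough. For a one-off connection, the -J command-line flag (for example ssh -J arcade $LAB_MACHINE_USERNAME@172.19.8.33) has the same effect as the ProxyJump keyword.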

Once this is done, you can copy your SSH key. If the key you will be using is in the default location (for example ~/.ssh/id_rsa.pub for an RSA key), you can simply run

ssh-copy-id $SSH_ALIAS_TO_LAB_MACHINE

If the key you are using is not in the default location, specify it with the -i argument:

ssh-copy-id -i $LOCATION_OF_YOUR_PUBLIC_KEY $SSH_ALIAS_TO_LAB_MACHINE

The key copying utility will first ask for your login node password, and then your lab machine password.


1.2+ : Temporarily enabling password authentication to allow a new lab member to copy their encryption key

You can temporarily enable password authentication to allow new lab members to copy their encryption key.

1.2+.0. Open sshd's config file at /etc/ssh/sshd_config with administrative privileges. Replace PasswordAuthentication no with PasswordAuthentication yes.

1.2+.1. Restart sshd to reload its settings using sudo service sshd restart

1.2+.2. Wait for the key copy to be done.

1.2+.3. Open sshd's config file at /etc/ssh/sshd_config with administrative privileges. Replace PasswordAuthentication yes with PasswordAuthentication no.

1.2+.4. Restart sshd to reload its settings using sudo service sshd restart

For security, password authentication should never be allowed unless we need it temporarily, for this use specifically. Do not leave it enabled.
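If you would rather script the edit than make it by hand, the toggle could look roughly like the sketch below. The helper name set_password_auth is hypothetical, and the demonstration runs on a temporary file instead of the real configuration. It only rewrites an existing, uncommented PasswordAuthentication line, so check the result before restarting sshd.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set PasswordAuthentication to "yes" or "no" in a file.
set_password_auth() {
    local value="$1" file="$2" tmp
    tmp="$(mktemp)"
    # Rewrite only an uncommented PasswordAuthentication line.
    sed -E "s/^PasswordAuthentication[[:space:]]+(yes|no)[[:space:]]*$/PasswordAuthentication ${value}/" \
        "$file" > "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration on a throwaway copy, not on /etc/ssh/sshd_config.
DEMO_SSHD_CONFIG="$(mktemp)"
printf 'Port 22\nPasswordAuthentication no\n' > "$DEMO_SSHD_CONFIG"

set_password_auth yes "$DEMO_SSHD_CONFIG"
grep '^PasswordAuthentication' "$DEMO_SSHD_CONFIG"   # prints "PasswordAuthentication yes"

set_password_auth no "$DEMO_SSHD_CONFIG"
grep '^PasswordAuthentication' "$DEMO_SSHD_CONFIG"   # prints "PasswordAuthentication no"
```

On a lab machine the helper would have to run as root against /etc/ssh/sshd_config. Running sudo sshd -t afterwards checks the file for syntax errors before you restart the service.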


1.3 : Make your config use the encryption key and verify it works

Edit your SSH config to use public key login rather than password login (note the PreferredAuthentications attribute):

Host blue
   ProxyJump arcade
   HostName 172.19.8.33
   User $LAB_MACHINE_USERNAME
   PreferredAuthentications publickey

Host green
   ProxyJump arcade
   HostName 172.19.8.32
   User $LAB_MACHINE_USERNAME
   PreferredAuthentications publickey

Host red
   ProxyJump arcade
   HostName 172.19.8.54
   User $LAB_MACHINE_USERNAME
   PreferredAuthentications publickey

Host rainbow
   ProxyJump arcade
   HostName 172.19.8.55
   User $LAB_MACHINE_USERNAME
   PreferredAuthentications publickey

You should now be able to access the lab machine using ssh $SSH_ALIAS_TO_LAB_MACHINE without being prompted for a password on the lab machine. You will still be prompted for the login node password, however.


1.4 : Make the login node password less annoying (macOS & Linux only)

Having to enter your login node password every time you log into a lab machine is annoying. Unfortunately, DIRO's login node uses Kerberos as its login manager, which makes public key authentication unusable, so for now we are stuck with password-based authentication. DIRO suggests setting up SSH multiplexing, which merges all SSH sessions to the login node into a single connection and keeps that connection alive for some time after the last session is closed. This is a weird solution, but it does make our lives a bit easier. SSH multiplexing is not supported on Windows at the moment, so unfortunately this solution is limited to macOS and Linux.

Edit your SSH config to obtain the following (we specifically care about the ControlPath, ControlMaster and ControlPersist attributes):

Host arcade
   HostName arcade.iro.umontreal.ca
   User $LOGIN_NODE_USERNAME
   PreferredAuthentications password
   ControlPath ~/.ssh/control/%r@%h:%p
   ControlMaster auto
   ControlPersist 60m

For this to work, the ~/.ssh/control/ directory must exist; you will need to create it. The session persistence time can be adjusted by changing the value of the ControlPersist attribute.
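Creating the control directory is a one-time step. A minimal sketch, assuming the ControlPath shown above:

```shell
#!/usr/bin/env bash
# Create the directory that holds the multiplexing control sockets.
mkdir -p "$HOME/.ssh/control"
# Keep it private to your account.
chmod 700 "$HOME/.ssh/control"

# Once you have connected, you can ask whether the shared connection
# is still alive with:
#   ssh -O check arcade
```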

If the connection to the login node drops out, multiplexing is sometimes slow to recognize this and start a new session. You can force the session to exit using this:

ssh -O exit arcade