There are several ways to set up and run the project:
Important notes:
The easiest way to view the course content is via the static nbviewer. While this does not allow you to change and execute code, it also doesn't require you to install software locally and only needs a browser.
When working on the labs we recommend you copy the notebooks to a folder on your Google Drive and open them as a Colab notebook. A thorough guide can be found here. A Colab notebook will run on Google's servers and will allow you access to limited Google Cloud Compute GPUs for your model training and validation. Furthermore, the lab notebooks will be covered using Colab, so environments and tools will be similar.
To be able to make changes to the book and render it dynamically, we recommend you use Docker.
We assume you have a command line interface (CLI) in your OS (bash, zsh, cygwin, git-bash, PowerShell, etc.). We assume this CLI expands $(pwd) to the current directory. If it doesn't, replace all mentions of $(pwd) with the path of the directory you are in.
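If you are unsure whether your shell performs this expansion, a quick check like the following may help. It is a minimal sketch for POSIX-style shells (bash, zsh, git-bash); the variable name `book_dir` is only illustrative and is not used elsewhere in this guide.

```shell
# Capture what "$(pwd)" expands to in this shell.
book_dir="$(pwd)"

# A non-empty, existing directory means the docker commands in this guide
# can use "$(pwd)" unchanged.
if [ -n "$book_dir" ] && [ -d "$book_dir" ]; then
    echo "\$(pwd) expands to: $book_dir"
else
    echo "\$(pwd) is not expanded; type the directory path by hand instead" >&2
fi
```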
For Mac and Windows, go to the Docker webpage and follow the instructions for your platform. Instructions for Ubuntu can be found here.
Next, you can download the stat-nlp-book docker image like so:
docker pull bjerva/stat-nlp-book:ndak18000u
If you get a permission error here or at any later point, try prepending sudo to the command:
sudo docker pull bjerva/stat-nlp-book:ndak18000u
This process may take a while, so use the time to start familiarising yourself with the structure of the course.
You can use the git installation in the docker container to get the repository:
docker run -v "$(pwd)":/home/jovyan/work bjerva/stat-nlp-book:ndak18000u git clone https://github.com/coastalcph/nlp-course
Note: this will create a new nlp-course directory in your current directory.
We assume from here on that you are in the top-level nlp-course directory:
cd nlp-course
Note: you need to be in the nlp-course directory every time you want to run/update the book.
docker run -it --rm -p 8888:8888 -v "$(pwd)":/home/jovyan/work bjerva/stat-nlp-book:ndak18000u
You are now ready to visit the overview page locally through the installed book.
Once installed, you can always run your notebook server by first changing into your local nlp-course directory, and then executing:
docker run -it --rm -p 8888:8888 -v "$(pwd)":/home/jovyan/work bjerva/stat-nlp-book:ndak18000u
This assumes that your docker daemon is running and that you are in the nlp-course directory. How to start the docker daemon depends on your system.
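Both assumptions can be checked before launching. The following is a hedged sketch and not part of the course tooling: the function names `in_course_dir` and `start_book` are hypothetical, the directory check assumes you kept the default clone name nlp-course, and the daemon check relies on `docker info` exiting with a non-zero status when the daemon cannot be reached.

```shell
# Succeeds only when the current directory is named nlp-course.
in_course_dir() {
    [ "$(basename "$(pwd)")" = "nlp-course" ]
}

# Hypothetical wrapper: verify both assumptions, then start the notebook server.
start_book() {
    if ! in_course_dir; then
        echo "Please cd into your nlp-course directory first." >&2
        return 1
    fi
    if ! docker info >/dev/null 2>&1; then
        echo "The docker daemon does not seem to be running." >&2
        return 1
    fi
    docker run -it --rm -p 8888:8888 -v "$(pwd)":/home/jovyan/work bjerva/stat-nlp-book:ndak18000u
}
```

With these functions defined in your shell, running `start_book` stands in for the longer command and reports which assumption failed instead of a docker error.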
We frequently make changes to the book. To get these changes, you
should first discard your local changes to avoid merge
conflicts. That is, you might have made changes (by changing the code
or simply running it) to files that we have also changed. In these cases git
will complain when you do the update. To overcome this, you can undo all
your changes by executing:
docker run -v "$(pwd)":/home/jovyan/work bjerva/stat-nlp-book:ndak18000u git checkout -- .
If you want to keep your changes, create copies of the changed files. Jupyter has a "Make a copy" option in the "File" menu for this. You can also create a clone of this repository to keep your own changes and merge our changes in a more controlled manner.
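Copies can also be made from the command line before running the checkout above. This is a minimal sketch under stated assumptions: `keep_copy` is a hypothetical helper, the `-mine` suffix is an arbitrary choice, and it only handles files ending in .ipynb. Because the copy is a new, untracked file, `git checkout -- .` leaves it in place.

```shell
# Hypothetical helper: copy a notebook to a sibling file with a -mine suffix,
# e.g. demo.ipynb -> demo-mine.ipynb.
keep_copy() {
    src="$1"
    case "$src" in
        *.ipynb) cp "$src" "${src%.ipynb}-mine.ipynb" ;;
        *) echo "keep_copy: expected a .ipynb file, got: $src" >&2; return 1 ;;
    esac
}
```

Call it once per notebook you edited, passing the path relative to the nlp-course directory, and only then discard your changes.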
Then, to get the actual updates, run:
docker run -v "$(pwd)":/home/jovyan/work bjerva/stat-nlp-book:ndak18000u git pull
If you cannot use Docker, you can alternatively set up the book directly. First, clone the repository:
git clone https://github.com/coastalcph/nlp-course
Enter the cloned directory:
cd nlp-course
and create the virtual environment:
python -m venv nlp_venv
source nlp_venv/bin/activate
pip install --upgrade pip
macOS: install Rust
curl https://sh.rustup.rs -sSf | sh
macOS: install the Xcode command line tools
xcode-select --install
pip install -r requirements.txt
jupyter-nbextension install rise --py --sys-prefix
jupyter-nbextension enable rise --py --sys-prefix
Start the notebook server (its UI will be opened automatically):
jupyter notebook
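The pip and jupyter commands above act on whichever Python environment is active, so in a new terminal it can be worth confirming that the virtual environment has been activated again. A minimal sketch: `venv_active` is a hypothetical helper, and it relies on the venv `activate` script exporting the `VIRTUAL_ENV` variable.

```shell
# Succeeds when a virtual environment is active in this shell.
venv_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if venv_active; then
    echo "Using virtual environment: $VIRTUAL_ENV"
else
    echo "No virtual environment active; run: source nlp_venv/bin/activate" >&2
fi
```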
The repository contains a lot of material, some of which may not be ready for consumption yet. This is why you should always access content through the top-level overview page (local-link).
Receive notifications for new updates by "Watch"-ing the repo.