All files that are associated with this course are available via this public git repository. The use of git
is encouraged, but not mandatory.
Using git, the course material can be downloaded by cloning the GitHub repository:
$ git clone https://github.com/cforssen/BayesianMethodsNP.git
after which it will be available in the BayesianMethodsNP directory. You will have to perform a git pull operation when you want to retrieve future updates from the GitHub repository. When using the GitHub repository to stay up to date with the course material, it is recommended to make a local copy of the notebooks before you start editing them. The easiest way is probably to create a new directory for your local copies. The git approach, however, would be to create a new branch.
There are plenty of tutorials and guides on how to use git. Two recommended ones are:
- The Git Handbook (a GitHub Guide),
- Pro Git (an online book by Scott Chacon and Ben Straub).
You are encouraged to set your user name and email address. This is important because every Git commit uses this information, and it’s immutably baked into the commits you start creating:
$ git config --global user.name "Emilia Student"
$ git config --global user.email [email protected]
You might also want to set your default editor. Pick the one that you prefer. E.g.,
$ git config --global core.editor emacs/vi/nano/...
We will make extensive use of Python as the programming language, together with its myriad of available libraries. Jupyter notebooks make it possible to run Python code interactively, with the immediate benefit of visualizing your data. It is also possible to run Julia or R code, or even to integrate code in compiled languages like C++ or Fortran if you prefer. The focus in this course will be on Python.
The notebooks that we will be using depend on several scientific Python modules (see the list in environment.yml) and they require a Python 3.x installation.
These Python modules and their dependencies are best installed using conda by creating a virtual environment. The conda package manager can easily be installed as a local user, which does not require administrator permissions. Note that the miniconda installation requires 400 MB of disk space.
- Download and run the miniconda installer:
$ wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
$ bash Miniconda3-latest-Linux-x86_64.sh
During the installation you will have to accept the license agreement. We also recommend using the default installation directory ~/miniconda3 and allowing the installer to run conda init to set environment variables (this is done by typing yes when prompted). It is recommended to use the bash shell. Further details and troubleshooting are available at conda.io.
- Check the installation:
$ conda list
should give a list of Python packages.
$ which python
should give the path to the conda Python executable
~/miniconda3/bin/python
and
$ python --version
should show that you are using a Python 3 version.
- Make sure that you have downloaded the course material, preferably with git clone as described above. Go to the directory with the environment.yml file.
- Create a virtual environment:
$ conda env create -f environment.yml
which reads the environment.yml file in your current directory. It will check dependencies and then download and install several packages. It might take a considerable time to run and you will probably be asked to accept the suggested installation procedure.
Some packages that we will be using are not included in the default conda channels. One can either specify the relevant package channel(s) in the environment.yml file (as done here), or add them to the default conda channel configuration via, e.g.,
$ conda config --append channels conda-forge
- Once the virtual environment has been created it can be activated:
$ conda activate bayesNP-env
For future reference, to deactivate the virtual environment:
$ conda deactivate
- To update the virtual environment (e.g. after adding packages to the environment.yml file):
$ conda env update -f environment.yml
There are also other options (venv, pipenv) for creating virtual environments with the Python version and packages that we will be using. Alternatively, you can install packages using pip install and keep track of the dependencies yourself.
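Whichever installation route you choose, it can be useful to confirm from within Python that the expected interpreter is active and that the required modules can be found. The following is a minimal sketch, not part of the course material: the function names are hypothetical, and the module list is only an assumption that should be replaced by the packages actually listed in environment.yml.

```python
import importlib.util
import sys

# Hypothetical module list (an assumption): replace it with the
# packages that are actually named in environment.yml.
REQUIRED_MODULES = ["numpy", "scipy", "matplotlib"]


def missing_modules(names):
    """Return the top-level module names that the import system cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def environment_report(names=REQUIRED_MODULES):
    """Summarize the interpreter in use and any modules that are not installed."""
    return {
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "executable": sys.executable,
        "is_python3": sys.version_info.major == 3,
        "missing": missing_modules(names),
    }


if __name__ == "__main__":
    # Print one line per entry, e.g. the path of the running interpreter.
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If the virtual environment is active, the reported executable should be located inside the conda installation directory, and the list of missing modules should be empty.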
Jupyter notebooks will be used extensively in this course. The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more.
Once the environment is set up and activated, you are encouraged to get acquainted with the notebook workflow by starting Jupyter Notebook in a directory with .ipynb files, e.g., in the directory tree of the cloned GitHub repository:
$ jupyter notebook
The Jupyter Notebook will open in your default browser. It should show a list of the subdirectories.
However, a word of caution: since Jupyter notebooks are JSON files that contain a lot of metadata, they do not integrate seamlessly with git version control. Evaluating a cell in a notebook, even without changing any code, will register as a file update in git.
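One way to reduce this noise is to remove cell outputs and execution counters before committing a notebook. The sketch below illustrates the idea with the standard json module only. It assumes the nbformat 4 layout (a top-level "cells" list whose code cells carry "outputs" and "execution_count" fields); the function name strip_outputs and the example notebook are hypothetical. Dedicated tools such as nbstripout address the same problem more thoroughly.

```python
import json


def strip_outputs(notebook):
    """Return a copy of a notebook dict with code-cell outputs and counters removed."""
    cleaned = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return cleaned


# Minimal hand-written notebook in the assumed nbformat 4 layout (illustration only).
example = {
    "cells": [
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}],
            "source": ["print(1 + 1)"],
        },
        {"cell_type": "markdown", "metadata": {}, "source": ["Some text"]},
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = strip_outputs(example)
# The code itself is untouched; only the evaluation results are dropped.
print(json.dumps(cleaned["cells"][0], indent=1))
```

For a real file, the same function could be applied to the result of json.load on the .ipynb file, and the cleaned dict written back with json.dump. Working on a copy of the notebook is advisable, since the stripped outputs cannot be recovered without re-running the cells.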