Skip to content

material for the programming tasks of MasterMath course "parallel algorithms" 24/25.

License

Notifications You must be signed in to change notification settings

jthies/par_algo2024

Repository files navigation

Parallel Algorithms 2024/2025

This repository contains material for the programming tasks of MasterMath course "parallel algorithms" 24/25. It is a dynamically extended tutorial that will take you from first steps (installing the Fortran compiler and libraries) to being able to write parallel programs and run them on a supercomputer yourself.

If you have never worked with git, it may be helpful to do this basic tutorial.

Working with the repository

In a terminal (either on your own computer or a cluster like DelftBlue), first clone the repository to get a local copy:

git clone https://github.com/jthies/par_algo2024

Whenever we announce an update via the elo course page, you need to update your local copy:

git pull

If you have local changes to files, this may result in a message:

error: cannot pull with rebase: You have unstaged changes.
error: Please commit or stash them.

The git status command shows you what files have local changes. You can run git stash to move these changes out of the way for the pull. If you then want to re-do them (after the pull), simply type git stash pop.

Note: You are free to commit your work to the local repository or a fork on github, but you cannot push to the repo itself. Committing your work locally makes sure you have the history and a backup in case you break your programs and is therefore highly recommended. Having local commits will typically not cause conflicts with the git pull as we only add new files for new tasks.

0. Basic programming environment

We recommend that you set up a working environment on your own computer. During the course, you will also get access to the DelftBlue supercomputer, where you do not need to install additional software.

The programming language used throughout the course is Fortran 2018. It has native support for parallel programming, so in principle all you will need is a Fortran compiler. We recommend using the GNU compiler gfortran. To enable actual parallel execution of your code, it relies on the OpenCoarrays library, which you have to install separately.

To install all you need in an Ubuntu terminal, you can use:

sudo apt install build-essential libopencoarrays-openmpi-dev

On Windows, you can configure the "Windows Subsystem for Linux" WSL to use Ubuntu as Linux distribution (and this should be the default). On other Linux systems, the package manager may be something other than apt, on MacOS the command is called brew and requires the Homebrew package manager to be installed. Package names may also differ between distributions. Make sure the opencoarrays package you install has MPI support, here indicated by the -openmpi in the name.

See this page for general instructions for various systems.

On the DelftBlue supercomputer, you do not need to install anything. Instead, you should load the required modules and you're good to go:

module load 2024r1 openmpi opencoarrays

Makefiles

Compiling a program (especially if it consists of multiple source files) can require (a series of) commands that are hard to memorize and specific to the programming environment. The Linux command make allows you to put the rules for "building" your program into a file called Makefile, and then uses these rules to build the program from the source code.

In this repository you will find a fully functional Makefile geared towards use on DelftBlue. In principle, it should also work on your own computer, though.

1. Hello World!

You find our first program in main_hello.f90. You can compile it using:

make main_hello.x

To run it with 4 processes, use:

mpirun -np 4 ./main_hello.x

Look at the program and output it produces. Do you notice anything unexpected?

Task: re-order the output

Can you adapt the program so that all "Hello" lines are printed before the "Goodbye" lines?

2. Dot Product

The second program is slightly more complicated. It consists of two source files: dotprod.f90 and main_dotprod.f90. The former is a Fortran module that provides several implementations of a function to compute the inner product of two distributed vectors (coarrays). The latter contains a program that will run these variants and print timing results.

  • Look at both source files and understand (at least roughly) how this program works, and what the different variants are.
  • compile and run the program "as is" using
make main_dotprod.x
mpirun -np 4 ./main_dotprod.x 1000000

You can adapt the number of processes if you have more than 4 cores available. In this example we use one million vector elements in total, you can change this as well.

Task: Implement the gatherbcast variant

You may have noticed that the gatherbcast variant is not implemented yet. Do this in dotprod.f90 and run the program again to test and time it. The way the variant works is sketched in the comments of that function.

Task: Implement your own variant

Implement at least one additional variant of your own. You can also try to generalize the "butterfly" variant, which currently only works if the number of processes is a power of 2. One idea for your own variant is to replace the sync all statements (partly) by syncrhonization between two processes (sync images(...) in Fortran).

Run the main program for the maximum number of processes available on your computer and different values of the vector length N. Which variant is the best?

3. Parallel Sample Sort

The paralllel sorting algorithm (described in Section 1.8 of the book by Rob Bisseling) is implemented as sort_coarray in the file sorting.f90. A corresponding driver routine is available in main_sorting.f90. It takes a single integer as command-line argument: The (approximate) number of elements to be assigned to each process. Your input will be rounded up to a multiple of P^2, where P is the number of processes (images) running the program.

Unit testing

This is a slightly more complex program than we saw before: sort_coarray calls other functions like quicksort and mergeparts. To assure that all components work properly, it is useful to employ a test framework. Here we will use fortuno. This is a fortran library that allows you to easily write and execute small test programs. On DelftBlue we have installed it for you, and the Makefile has been extended with targets main_test.x and test. If you type make test, the test driver is built and executed for a varying number of processes.

Unfortunately, the co-array extension of fortuno did not work on my Ubuntu laptop. I would therefore presently not recommend installing it but trying it out on DelftBlue instead.

"Tasks"

  • Use the dot product and sorting examples to familiarize yourself with Fortran, DelftBlue job submission, make files (e.g., try switching to debugging options for your build).
  • There may be bugs in the sorting program (every program always has them...). If you encounter one, you may try to add unit tests that isolate it and then fix the problem. A patch or pull request with a bug fix to this repository may put your teachers in a good mood during the final oral exam...
  • Use the structure of the sorting program as inspiration for your first bigger programming task: the Sieve of Eratosthenes

Note: To run this (or any) program on DelftBlue compute nodes (rather than the login node), use srun instead of mpirun. This requires a number of flags (in particular the --mem-per-cpu flag), that are most conveniently set in a job script. See benchmarks.slurm for an example of a job script, and the DelftBlue documentation for many more. For quick tests, do not use the --exclusive flag as it will request a full node, which may make your waiting times long.

About

material for the programming tasks of MasterMath course "parallel algorithms" 24/25.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published