This chapter has had a big rewrite!
As always, feedback is welcome, but especially-especially since this stuff is all so new. Let me know how you get on :)
As our site grows, it takes longer and longer to run all of our functional tests. If this continues, the danger is that we’re going to stop bothering.
Rather than let that happen, we can automate the running of functional tests by setting up "Continuous Integration", or CI. That way, in day-to-day development, we can just run the FT that we’re working on at that time, and rely on CI to run all the other tests automatically and let us know if we’ve broken anything accidentally.
The unit tests should stay fast enough that we can keep running the full suite locally, every few seconds.
Note
Continuous Integration is another practice that was popularised by Kent Beck’s Extreme Programming (XP) movement in the 1990s.
As we’ll see, one of the great frustrations of CI is that the feedback loop is so much slower than working locally. As we go along, we’ll look for ways to optimise for that, where we can.
While debugging, we’ll also touch on the theme of reproducibility, the idea that we want to be able to reproduce behaviours of our CI environment locally, in the same way we try and make our production and dev environments as similar as possible.
CI as a concept has been around since the 90s at least, and traditionally would be done by configuring a server, perhaps under a desk in the corner of the office, with software on it that could pull down all the code from the main branch at the end of each day, and scripts to compile all the code and run all the tests, a process that became known as a "build". Then each morning developers would take a look at the results, and deal with any "broken builds".
As the practice has spread and feedback cycles have grown faster, CI software has evolved (Jenkins was a popular choice in the 2010s), and CI has become a common "cloud" service, designed to integrate with code hosting providers like GitHub, or even provided directly by them: GitHub has "GitHub Actions", which, because of that built-in prominence, is probably the most popular choice for Open Source projects these days. In a corporate environment, you might come across other solutions like CircleCI, Travis, and GitLab.
It is still absolutely possible to download and self-host your own CI server; in the 1e and 2e I demonstrated the use of Jenkins. But the installation and subsequent admin/maintenance burden is not trivial, so for this edition I wanted to pick a service that’s as much like the kind of thing you’re likely to encounter in your day job as possible, while trying not to endorse the largest commercial provider. There’s nothing wrong with GitHub Actions! They just don’t need any more help dominating the market.
So I’ve decided to use GitLab in this book. They are absolutely a commercial service, but they retain an open-source version, and you can self-host it in theory.
The syntax (it’s always YAML…) and core concepts are common across all providers, so the things you learn here will be replicable in whichever service you encounter in future.
Like most of the services out there, GitLab has a free tier, which will work fine for our purposes.
GitLab is primarily a code hosting service like GitHub, so the first thing to do is get our code up there.
Use the New Project → Create Blank Project option, as in Creating a New Repo on Gitlab.
First we set up GitLab as a "remote" for our project:
# substitute your username and project name as necessary
$ git remote add origin git@gitlab.com:yourusername/superlists.git

# or, if you already have "origin" defined:
$ git remote add gitlab git@gitlab.com:yourusername/superlists.git
$ git remote -v
gitlab  git@gitlab.com:hjwp/superlistz.git (fetch)
gitlab  git@gitlab.com:hjwp/superlistz.git (push)
origin  git@gitlab.com:hjwp/book-example.git (fetch)
origin  git@gitlab.com:hjwp/book-example.git (push)
Now we can push up our code with git push. The -u flag sets up a "remote-tracking" branch, so you can push and pull without explicitly specifying the remote and branch:
# if using 'origin', it will be the default:
$ git push -u

# or, if you have multiple remotes:
$ git push -u gitlab
Enumerating objects: 706, done.
Counting objects: 100% (706/706), done.
Delta compression using up to 11 threads
Compressing objects: 100% (279/279), done.
Writing objects: 100% (706/706), 918.72 KiB | 131.25 MiB/s, done.
Total 706 (delta 413), reused 682 (delta 408), pack-reused 0 (from 0)
remote: Resolving deltas: 100% (413/413), done.
To gitlab.com:hjwp/superlistz.git
 * [new branch]      main -> main
branch 'main' set up to track 'gitlab/main'.
If you refresh the GitLab UI, you should now see your code, as in CI Project Files on GitLab:
The "Pipeline" terminology was popularised by Dave Farley and Jez Humble in their book Continuous Delivery. The name alludes to the fact that a CI build typically has a series, where the process flows from one to another.
Go to Build → Pipelines, and you’ll see a list of example templates. When getting to know a new configuration language, it’s nice to be able to start with something that works, rather than a blank slate.
I chose the Python example template and made a few customisations:
# Use the same image as our Dockerfile
image: python:slim
# These two settings let us cache pip-installed packages;
# they came from the default template
variables:
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
cache:
paths:
- .cache/pip
# "setUp" phase, before the main build
before_script:
- python --version ; pip --version # For debugging
- pip install virtualenv
- virtualenv .venv
- source .venv/bin/activate
# This is the main build
test:
script:
- pip install -r requirements.txt # (1)
# unit tests
- python src/manage.py test lists accounts # (2)
# (if those pass) all tests, incl. functional.
- pip install selenium # (3)
- cd src && python manage.py test # (4)
YAML once again folks!
1. We start by installing our core requirements.
2. I’ve decided to run the unit tests first. This gives us an "early failure" if there’s any problem at this stage, and saves us from having to run, and more importantly wait for, the FTs.
3. Then we need Selenium for the functional tests. Again, I’m delaying this pip install until it’s absolutely necessary, to get feedback as quickly as possible.
4. And here is a full test run, including the functional tests.
Tip
It’s a good idea in CI pipelines to try and run the quickest tests first, so that you can get feedback as quickly as possible.
You can use the GitLab web UI to edit your pipeline YAML, and then when you save it you can go check for results straight away.
But it is also just a file in your repo!
So you can edit it locally.
You’ll need to commit it and then git push up to GitLab, and then go check the Jobs section in the Build UI.
$ git push gitlab
However you click through the UI, you should be able to find your way to the output of the build Job, as in First Build on GitLab:
Here’s a selection of what I saw in the output console:
Running with gitlab-runner 17.7.0~pre.103.g896916a8 (896916a8)
  on green-1.saas-linux-small-amd64.runners-manager.gitlab.com/default JLgUopmM, system ID: s_deaa2ca09de7
Preparing the "docker+machine" executor 00:20
Using Docker executor with image python:latest ...
Pulling docker image python:latest ...
[...]
$ python src/manage.py test lists accounts
Creating test database for alias 'default'...
Found 55 test(s).
System check identified no issues (0 silenced).
................../builds/hjwp/book-example/.venv/lib/python3.13/site-packages/django/core/handlers/base.py:61: UserWarning: No directory at: /builds/hjwp/book-example/src/static/
  mw_instance = middleware(adapted_handler)
.....................................
---------------------------------------------------------------------
Ran 55 tests in 0.129s

OK
Destroying test database for alias 'default'...
$ pip install selenium
Collecting selenium
  Using cached selenium-4.28.1-py3-none-any.whl.metadata (7.1 kB)
Collecting urllib3<3,>=1.26 (from urllib3[socks]<3,>=1.26->selenium)
[...]
Successfully installed attrs-25.1.0 certifi-2025.1.31 h11-0.14.0 idna-3.10
outcome-1.3.0.post0 pysocks-1.7.1 selenium-4.28.1 sniffio-1.3.1
sortedcontainers-2.4.0 trio-0.29.0 trio-websocket-0.12.1
typing_extensions-4.12.2 urllib3-2.3.0 websocket-client-1.8.0 wsproto-1.2.0
$ cd src && python manage.py test
Creating test database for alias 'default'...
Found 63 test(s).
System check identified no issues (0 silenced).
......../builds/hjwp/book-example/.venv/lib/python3.13/site-packages/django/core/handlers/base.py:61: UserWarning: No directory at: /builds/hjwp/book-example/src/static/
  mw_instance = middleware(adapted_handler)
...............................................EEEEEEEE
======================================================================
ERROR: test_layout_and_styling (functional_tests.test_layout_and_styling.LayoutAndStylingTest.test_layout_and_styling)
---------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/hjwp/book-example/src/functional_tests/base.py", line 30, in setUp
    self.browser = webdriver.Firefox()
                   ~~~~~~~~~~~~~~~~~^^
[...]
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 255
---------------------------------------------------------------------
Ran 63 tests in 8.658s

FAILED (errors=8)
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 255
You can see we got through the unit tests, and then in the full test run we have 8 errors out of 63 tests. The FTs are all failing.
I’m "lucky" because I’ve done this sort of thing many times before, so I know what to expect: it’s failing because Firefox isn’t installed in the image we’re using.
Let’s modify the script to add an apt install. Again, we’ll do it as late as possible.
# This is the main build
test:
script:
- pip install -r requirements.txt
# unit tests
- python src/manage.py test lists accounts
# (if those pass) all tests, incl. functional.
- apt update -y && apt install -y firefox-esr # (1)
- pip install selenium
- cd src && python manage.py test
1. We use the Debian Linux apt package manager to install Firefox. firefox-esr is the "extended support release", which is a more stable version of Firefox to test against.
If you run it again, and wait a bit, you’ll see we get a slightly different failure:
$ apt-get update -y && apt-get install -y firefox-esr
Get:1 http://deb.debian.org/debian bookworm InRelease [151 kB]
Get:2 http://deb.debian.org/debian bookworm-updates InRelease [55.4 kB]
Get:3 http://deb.debian.org/debian-security bookworm-security InRelease [48.0 kB]
[...]
The following NEW packages will be installed:
  adwaita-icon-theme alsa-topology-conf alsa-ucm-conf at-spi2-common
  at-spi2-core dbus dbus-bin dbus-daemon dbus-session-bus-common
  dbus-system-bus-common dbus-user-session dconf-gsettings-backend
  dconf-service dmsetup firefox-esr fontconfig fontconfig-config
[...]
Get:117 http://deb.debian.org/debian-security bookworm-security/main amd64 firefox-esr amd64 128.7.0esr-1~deb12u1 [69.8 MB]
[...]
Selecting previously unselected package firefox-esr.
Preparing to unpack .../105-firefox-esr_128.7.0esr-1~deb12u1_amd64.deb ...
Adding 'diversion of /usr/bin/firefox to /usr/bin/firefox.real by firefox-esr'
Unpacking firefox-esr (128.7.0esr-1~deb12u1) ...
[...]
Setting up firefox-esr (128.7.0esr-1~deb12u1) ...
update-alternatives: using /usr/bin/firefox-esr to provide /usr/bin/x-www-browser (x-www-browser) in auto mode
[...]
======================================================================
ERROR: test_multiple_users_can_start_lists_at_different_urls (functional_tests.test_simple_list_creation.NewVisitorTest.test_multiple_users_can_start_lists_at_different_urls)
---------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/hjwp/book-example/src/functional_tests/base.py", line 30, in setUp
    self.browser = webdriver.Firefox()
                   ~~~~~~~~~~~~~~~~~^^
[...]
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 1
---------------------------------------------------------------------
Ran 63 tests in 3.654s

FAILED (errors=8)
We can see Firefox installing OK, but we still get an error. This time it’s exit code 1.
The cycle of "change .gitlab-ci.yml, push, wait for a build, check results" is painfully slow.
Let’s see if we can reproduce this error locally.
To reproduce the CI environment locally, I put together a quick Dockerfile, by copy-pasting the steps from the script section and prefixing them with RUN commands:
FROM python:slim
RUN pip install virtualenv
RUN virtualenv .venv
# this won't work
# RUN source .venv/bin/activate
# use full path to venv instead.
COPY requirements.txt requirements.txt
RUN .venv/bin/pip install -r requirements.txt
RUN apt update -y && apt install -y firefox-esr
RUN .venv/bin/pip install selenium
COPY infra/debug-ci.py debug-ci.py
CMD .venv/bin/python debug-ci.py
And let’s add a little debug script at debug-ci.py:
from selenium import webdriver
# just try to open a selenium session
webdriver.Firefox().quit()
We build and run it like this:
$ docker build -f infra/Dockerfile.ci -t debug-ci . && \
  docker run -it debug-ci
[...]
 => [internal] load build definition from infra/Dockerfile.ci           0.0s
 => => transferring dockerfile: [...]
 => [internal] load metadata for docker.io/library/python:slim          [...]
 => [1/8] FROM docker.io/library/python:slim@sha256:[...]
 => CACHED [2/8] RUN pip install virtualenv                             0.0s
 => CACHED [3/8] RUN virtualenv .venv                                   0.0s
 => CACHED [4/8] COPY requirements.txt requirements.txt                 0.0s
 => CACHED [5/8] RUN .venv/bin/pip install -r requirements.txt          0.0s
 => CACHED [6/8] RUN apt update -y && apt install -y firefox-esr        0.0s
 => CACHED [7/8] RUN .venv/bin/pip install selenium                     0.0s
 => [8/8] COPY infra/debug-ci.py debug-ci.py                            0.0s
 => exporting to image                                                  0.0s
 => => exporting layers                                                 0.0s
 => => writing image sha256:[...]
 => => naming to docker.io/library/debug-ci                             0.0s
Traceback (most recent call last):
  File "//.venv/lib/python3.13/site-packages/selenium/webdriver/common/driver_finder.py", line 67, in _binary_paths
    output = SeleniumManager().binary_paths(self._to_args())
[...]
selenium.common.exceptions.WebDriverException: Message: Unsupported platform/architecture combination: linux/aarch64

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "//debug-ci.py", line 4, in <module>
    webdriver.Firefox().quit()
    ~~~~~~~~~~~~~~~~~^^
[...]
selenium.common.exceptions.NoSuchDriverException: Message: Unable to obtain driver for firefox; For documentation on this error, please visit: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors/driver_location
You might not see this one: that "Unsupported platform/architecture combination" error is spurious; it’s because I was on an ARM-based Mac. Let’s try again, specifying the platform explicitly:
$ docker build -f infra/Dockerfile.ci -t debug-ci --platform=linux/amd64 . && \
  docker run --platform=linux/amd64 -it debug-ci
[...]
Traceback (most recent call last):
  File "//debug-ci.py", line 4, in <module>
    webdriver.Firefox().quit()
[...]
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 1
OK, that’s a repro of our issue. But no further clues yet!
Getting debug information out of Selenium can be a bit fiddly. I tried two avenues, setting options and setting the service; the former doesn’t really work as far as I can tell, but the latter does. There is some limited info in the Selenium docs.
import subprocess
from selenium import webdriver
options = webdriver.FirefoxOptions() # (1)
options.log.level = "trace"
service = webdriver.FirefoxService( # (2)
log_output=subprocess.STDOUT, service_args=["--log", "trace"]
)
# just try to open a selenium session
webdriver.Firefox(options=options, service=service).quit()
1. This is how I attempted to increase the log level using options. I had to reverse-engineer it from the source code, and it doesn’t seem to work anyway, but I thought I’d leave it here for future reference.
2. This is the FirefoxService config class, which does seem to let you print some debug info. I’m configuring it to print to standard-out.
Sure enough we can see some output now!
$ docker build -f infra/Dockerfile.ci -t debug-ci --platform=linux/amd64 . && \
  docker run --platform=linux/amd64 -it debug-ci
[...]
1234567890111 geckodriver INFO Listening on 127.0.0.1:XXXX
1234567890112 webdriver::server DEBUG -> POST /session {"capabilities": {"firstMatch": [{}], "alwaysMatch": {"browserName": "firefox", "acceptInsecureCerts": true, ... , "moz:firefoxOptions": {"binary": "/usr/bin/firefox", "prefs": {"remote.active-protocols": 3}, "log": {"level": "trace"}}}}}
1234567890111 geckodriver::capabilities DEBUG Trying to read firefox version from ini files
1234567890111 geckodriver::capabilities DEBUG Trying to read firefox version from binary
1234567890111 geckodriver::capabilities DEBUG Found version 128.7esr
1740029792102 mozrunner::runner INFO Running command: MOZ_CRASHREPORTER="1" MOZ_CRASHREPORTER_NO_REPORT="1" MOZ_CRASHREPORTER_SHUTDOWN="1" [...] "--remote-debugging-port" [...] "-no-remote" "-profile" "/tmp/rust_mozprofile[...]
1234567890111 geckodriver::marionette DEBUG Waiting 60s to connect to browser on 127.0.0.1
1234567890111 geckodriver::browser TRACE Failed to open /tmp/rust_mozprofile[...]
1234567890111 geckodriver::marionette TRACE Retrying in 100ms
Error: no DISPLAY environment variable specified
1234567890111 geckodriver::browser DEBUG Browser process stopped: exit status: 1
1234567890112 webdriver::server DEBUG <- 500 Internal Server Error {"value":{"error":"unknown error","message":"Process unexpectedly closed with status 1","stacktrace":""}}
Traceback (most recent call last):
  File "//debug-ci.py", line 13, in <module>
    webdriver.Firefox(options=options, service=service).quit()
[...]
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 1
Well, it wasn’t immediately obvious what was going on there, but I did eventually get a clue from the line that says Error: no DISPLAY environment variable specified.
Out of curiosity, I thought I’d try running firefox directly:
$ docker build -f infra/Dockerfile.ci -t debug-ci --platform=linux/amd64 . && \
  docker run --platform=linux/amd64 -it debug-ci firefox
[...]
Error: no DISPLAY environment variable specified
Sure enough, the same error.
And if you search around for this error, you’ll eventually find enough pointers to the answer: Firefox is crashing because it can’t find a display. Servers are "headless", meaning they don’t have a screen. Thankfully, Firefox has a headless mode, which we can enable by setting an environment variable, MOZ_HEADLESS.
Let’s confirm that locally. We’ll use the -e flag for docker run:
$ docker build -f infra/Dockerfile.ci -t debug-ci --platform=linux/amd64 . && \
  docker run -e MOZ_HEADLESS=1 --platform=linux/amd64 -it debug-ci
1234567890111 geckodriver INFO Listening on 127.0.0.1:43137
[...]
*** You are running in headless mode.
[...]
1234567890112 webdriver::server DEBUG Teardown
[...]
1740030525996 Marionette DEBUG Closed connection 0
1234567890111 geckodriver::browser DEBUG Browser process stopped: exit status: 0
1234567890112 webdriver::server DEBUG <- 200 OK
[...]
It takes quite a long time to run, and there’s lots of debug output, but… it looks OK!
Let’s set that environment variable in our CI script:
variables:
# Put pip-cache in home folder so we can use gitlab cache
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
# Make Firefox run headless.
MOZ_HEADLESS: "1"
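As an aside, if you ever want to switch headless mode on from within the test code itself, rather than via an environment variable, Selenium’s FirefoxOptions can do that too. Here’s a minimal sketch (we don’t need it for our setup, since MOZ_HEADLESS does the job without any code changes):

from selenium import webdriver

# Ask geckodriver to launch Firefox with the -headless flag,
# which has the same effect as setting MOZ_HEADLESS=1.
options = webdriver.FirefoxOptions()
options.add_argument("-headless")

browser = webdriver.Firefox(options=options)
browser.quit()

One nice property of the environment-variable approach is that we can leave base.py untouched, and still watch the browser pop up when we run the FTs on our own machines.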
Tip
Using a local Docker image to repro the CI environment is a hint that it might be worth investing time in running CI in a custom Docker image that you fully control; this is another way of improving reproducibility. We won’t have time to go into detail in this book though.
That worked! Or at least, it almost did. All but one of the FTs passed for me, but there was one unexpected failure:
+ python manage.py test functional_tests
......F.
======================================================================
FAIL: test_can_start_a_todo_list (functional_tests.test_simple_list_creation.NewVisitorTest)
---------------------------------------------------------------------
Traceback (most recent call last):
  File "...goat-book/functional_tests/test_simple_list_creation.py", line 38, in test_can_start_a_todo_list
    self.wait_for_row_in_list_table('2: Use peacock feathers to make a fly')
  File "...goat-book/functional_tests/base.py", line 51, in wait_for_row_in_list_table
    raise e
  File "...goat-book/functional_tests/base.py", line 47, in wait_for_row_in_list_table
    self.assertIn(row_text, [row.text for row in rows])
AssertionError: '2: Use peacock feathers to make a fly' not found in ['1: Buy peacock feathers']
---------------------------------------------------------------------
Now you might not see this error, but it’s common for the switch to CI to flush out some "flaky" tests, things that will fail intermittently. In CI a common cause is the "noisy neighbour" problem, where the CI server might be much slower than your own machine, thus flushing out some race conditions, or in this case, just randomly hanging for a few seconds, taking us past the default timeout.
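It’s worth remembering why the timeout matters here. Our FT helpers, like the wait_for_row_in_list_table you can see in that traceback, are built on a polling pattern: keep retrying an assertion until it passes, or until MAX_WAIT seconds have gone by. Here’s a rough sketch of the idea (the real helper in base.py differs in its details):

import time

from selenium.common.exceptions import WebDriverException

MAX_WAIT = 5  # seconds -- this is the number we end up being more generous with for CI


def wait_for(assertion_fn):
    """Retry an assertion until it passes, or until MAX_WAIT seconds have elapsed."""
    start_time = time.time()
    while True:
        try:
            return assertion_fn()  # if the assertion passes, we're done
        except (AssertionError, WebDriverException) as e:
            if time.time() - start_time > MAX_WAIT:
                raise e  # give up and surface the last failure
            time.sleep(0.5)  # otherwise, wait a little and poll again

So when a CI runner randomly hangs for a few seconds, the polling simply runs out of time, and we see exactly the kind of intermittent failure above. One blunt but effective fix, which we’ll resort to shortly, is to be more generous with MAX_WAIT.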
Let’s give ourselves some tools to help debug though.
To be able to debug unexpected failures that happen on a remote server, it would be good to see a picture of the screen at the moment of the failure, and maybe also a dump of the HTML of the page.
We can do that using some custom logic in our FT class tearDown. We’ll need to do a bit of introspection of unittest internals, a private attribute called ._outcome, but this will work:
import os
import time
from datetime import datetime
from pathlib import Path
[...]
MAX_WAIT = 5
SCREEN_DUMP_LOCATION = Path(__file__).absolute().parent / "screendumps"
[...]
def tearDown(self):
if self._test_has_failed():
if not SCREEN_DUMP_LOCATION.exists():
SCREEN_DUMP_LOCATION.mkdir(parents=True)
self.take_screenshot()
self.dump_html()
self.browser.quit()
super().tearDown()
def _test_has_failed(self):
# slightly obscure but couldn't find a better way!
return self._outcome.result.failures or self._outcome.result.errors
We first create a directory for our screenshots if necessary. Then we use the Selenium method get_screenshot_as_file() and the browser.page_source attribute, for our image and HTML dumps respectively:
def take_screenshot(self):
path = SCREEN_DUMP_LOCATION / self._get_filename("png")
print("screenshotting to", path)
self.browser.get_screenshot_as_file(str(path))
def dump_html(self):
path = SCREEN_DUMP_LOCATION / self._get_filename("html")
print("dumping page HTML to", path)
path.write_text(self.browser.page_source)
And finally here’s a way of generating a unique filename identifier, which includes the name of the test and its class, as well as a timestamp:
def _get_filename(self, extension):
timestamp = datetime.now().isoformat().replace(":", ".")[:19]
return (
f"{self.__class__.__name__}.{self._testMethodName}-{timestamp}.{extension}"
)
You can test this first locally by deliberately breaking one of the tests, with a self.fail() for example.
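For instance, here’s the sort of throwaway change I mean, tacked onto the end of one of the existing FTs (I’ve used the My Lists test from earlier chapters, but any test will do; just remember to delete the line again afterwards):

    def test_logged_in_users_lists_are_saved_as_my_lists(self):
        [...]
        self.fail("deliberately failing, to check the screenshot logic")  # temporary!

Run that test, and you’ll see something like this: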
$ ./src/manage.py test functional_tests.test_my_lists
[...]
Fscreenshotting to ...goat-book/src/functional_tests/screendumps/MyListsTest.test_logged_in_users_lists_are_saved_as_my_lists-[...]
dumping page HTML to ...goat-book/src/functional_tests/screendumps/MyListsTest.test_logged_in_users_lists_are_saved_as_my_lists-[...]
Fscreenshotting to ...goat-book/src/functional_tests/screendumps/MyListsTest.test_logged_in_users_lists_are_saved_as_my_lists-2025-02-18T11.29.00.png
dumping page HTML to ...goat-book/src/functional_tests/screendumps/MyListsTest.test_logged_in_users_lists_are_saved_as_my_lists-2025-02-18T11.29.00.html
We also need to tell GitLab to "save" these files so that we can actually look at them. This is called "artifacts":
test:
[...]
script:
[...]
artifacts: # (1)
when: always # (2)
paths: # (1)
- src/functional_tests/screendumps/
1. artifacts is the name of the key, and the paths argument is fairly self-explanatory. You can use wildcards here; there’s more info in the GitLab docs.
2. One thing the docs didn’t make obvious is that you need when: always, because otherwise it won’t save artifacts for failed jobs. That was annoyingly hard to figure out!
In any case, that should work. If we commit the code and push it up to GitLab, we should see a new build job.
$ echo "src/functional_tests/screendumps" >> .gitignore $ git commit -am "add screenshot on failure to FT runner" $ git push
In its output, we’ll see the screenshots and HTML dumps being saved:
screendumps/LoginTest.test_can_get_email_link_to_log_in-window0-2014-01-22T17.45.12.html
Fscreenshotting to /builds/hjwp/book-example/src/functional_tests/screendumps/NewVisitorTest.test_can_start_a_todo_list-2025-02-17T17.51.01.png
dumping page HTML to /builds/hjwp/book-example/src/functional_tests/screendumps/NewVisitorTest.test_can_start_a_todo_list-2025-02-17T17.51.01.html
Not Found: /favicon.ico
.screenshotting to /builds/hjwp/book-example/src/functional_tests/screendumps/NewVisitorTest.test_multiple_users_can_start_lists_at_different_urls-2025-02-17T17.51.06.png
dumping page HTML to /builds/hjwp/book-example/src/functional_tests/screendumps/NewVisitorTest.test_multiple_users_can_start_lists_at_different_urls-2025-02-17T17.51.06.html
======================================================================
FAIL: test_can_start_a_todo_list (functional_tests.test_simple_list_creation.NewVisitorTest.test_can_start_a_todo_list)
[...]
And to the right, some new UI options appear that let us browse the artifacts, as in Artifacts Appear on the Right of the Build Job.
And if you navigate through, you’ll see something like Our Screenshot in the GitLab UI, Looking Unremarkable:
Hm. No obvious clues there. Well, when in doubt, bump the timeout, as the old adage goes:
MAX_WAIT = 10
Then we can rerun the build by pushing, and confirm it now works. At this point we should have a working pipeline, as in A Successful GitLab Pipeline:
There’s a set of tests we almost forgot—the JavaScript tests. Currently our "test runner" is an actual web browser. To get them running in CI, we need a command-line test runner.
Note
Our JavaScript tests currently test the interaction between our code and the bootstrap framework/CSS, so we still need a real browser to be able to make our visibility checks work.
Thankfully, the Jasmine docs point us straight towards the kind of tool we need: Jasmine Browser Runner.
It’s time to stop pretending we’re not in the JavaScript game. We’re doing web development. That means we do JavaScript. That means we’re going to end up with node.js on our computers. It’s just the way it has to be.
Head to the node.js homepage and follow the instructions there. It should guide you through installing the "node version manager" (NVM), and then getting the latest version of node.
$ nvm install 22  # or whichever the latest version is
Installing Node v22.14.0 (arm64)
[...]
$ node -v
v22.14.0
The docs suggest we install it like this, and then run the init command to generate a default config file:
$ cd src/lists/static
$ npm install --save-dev jasmine-browser-runner jasmine-core
[...]
added 151 packages in 4s

$ cat package.json  # this is the equivalent of requirements.txt
{
  "devDependencies": {
    "jasmine-browser-runner": "^3.0.0",
    "jasmine-core": "^5.6.0"
  }
}

$ ls node_modules/  # will show several dozen directories

$ npx jasmine-browser-runner init
Wrote configuration to spec/support/jasmine-browser.mjs.
Well, we now have about a million files in node_modules/ (which is essentially JavaScript’s version of a virtualenv), and we also have a new config file in spec/support/jasmine-browser.mjs.
That’s not the ideal place, because we’ve said our tests live in a folder called tests, so let’s move the config file in there.
$ mv spec/support/jasmine-browser.mjs tests/jasmine-browser-runner.config.mjs
$ rm -rf spec
Then let’s edit it slightly, to specify a few things correctly:
export default {
srcDir: ".", // (1)
srcFiles: [
"*.js"
],
specDir: "tests", // (2)
specFiles: [
"**/*[sS]pec.js"
],
helpers: [
"helpers/**/*.js"
],
env: {
stopSpecOnExpectationFailure: false,
stopOnSpecFailure: false,
random: true,
forbidDuplicateNames: true
},
listenAddress: "localhost",
hostname: "localhost",
browser: {
name: "headlessFirefox" // (3)
}
};
1. Our source files are in the current directory, src/lists/static, i.e. lists.js.
2. And our spec files are in tests/.
3. And here we say we want to use the "headless" version of Firefox. (We could have done this by setting MOZ_HEADLESS at the command line again, but this saves us from having to remember.)
Let’s try running it now. We use the --config option to pass it the now non-standard path to the config file:
$ npx jasmine-browser-runner runSpecs --config=tests/jasmine-browser-runner.config.mjs
Jasmine server is running here: http://localhost:62811
Jasmine tests are here: ...goat-book/src/lists/static/tests
Source files are here: ...goat-book/src/lists/static
Running tests in the browser...
Randomized with seed 17843
Started
.F.

Failures:
1) Superlists tests error message should be hidden on input
  Message:
    Expected true to be false.
  Stack:
    <Jasmine>
    @http://localhost:62811/spec/Spec.js:46:40
    <Jasmine>

3 specs, 1 failure
Finished in 0.014 seconds
Randomized with seed 17843 (jasmine-browser-runner runSpecs --seed=17843)
Could be worse! 1 failure out of 3 specs.
Unfortunately, it’s the most important test:
it("error message should be hidden on input", () => {
initialize(inputSelector);
textInput.dispatchEvent(new InputEvent("input"));
expect(errorMsg.checkVisibility()).toBe(false);
});
Ah yes, remember how I said the whole reason we need to use a browser-based test runner is that our visibility checks depend on the Bootstrap CSS framework?
In the HTML spec runner which we’d configured so far, we load Bootstrap using a <link> tag:
<!-- Bootstrap CSS -->
<link href="../bootstrap/css/bootstrap.min.css" rel="stylesheet">
And here’s how we load it for jasmine-browser-runner:
export default {
srcDir: ".",
srcFiles: [
"*.js"
],
specDir: "tests",
specFiles: [
"**/*[sS]pec.js"
],
cssFiles: [ // (1)
"bootstrap/css/bootstrap.min.css" // (1)
],
helpers: [
"helpers/**/*.js"
],
1. The cssFiles key is how you tell the runner to load, er, some CSS. I found that out in the docs.
Let’s give that a go…
$ npx jasmine-browser-runner runSpecs --config=tests/jasmine-browser-runner.config.mjs
Jasmine server is running here: http://localhost:62901
Jasmine tests are here:
/Users/harry.percival/workspace/Book-TDD-Web-Dev-Python/source/chapter_25_CI/superlists/src/lists/static/tests
Source files are here:
/Users/harry.percival/workspace/Book-TDD-Web-Dev-Python/source/chapter_25_CI/superlists/src/lists/static
Running tests in the browser...
Randomized with seed 06504
Started
...

3 specs, 0 failures
Finished in 0.016 seconds
Randomized with seed 06504 (jasmine-browser-runner runSpecs --seed=06504)
Hooray! That works locally; let’s get it into CI.
# add the package.json, which saves our node dependencies
$ git add src/lists/static/package.json src/lists/static/package-lock.json

# ignore the node_modules/ directory
$ echo "node_modules/" >> .gitignore

# and our config file
$ git add src/lists/static/tests/jasmine-browser-runner.config.mjs

$ git commit -m "config for node + jasmine-browser-runner for JS tests"
We now want two different build steps, so let’s rename test to test-python and move all its specific bits, like variables and before_script, inside it, and then create a separate step called test-js, with a similar structure:
test-python:
# Use the same image as our Dockerfile
image: python:slim # (1)
variables: # (1)
# Put pip-cache in home folder so we can use gitlab cache
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
# Make Firefox run headless.
MOZ_HEADLESS: "1"
cache: # (1)
paths:
- .cache/pip
# "setUp" phase, before the main build
before_script: # (1)
- python --version ; pip --version # For debugging
- pip install virtualenv
- virtualenv .venv
- source .venv/bin/activate
script:
- pip install -r requirements.txt
# unit tests
- python src/manage.py test lists accounts
# (if those pass) all tests, incl. functional.
- apt update -y && apt install -y firefox-esr
- pip install selenium
- cd src && python manage.py test
artifacts:
when: always
paths:
- src/functional_tests/screendumps/
test-js: # (2)
image: node:slim
script:
- apt update -y && apt install -y firefox-esr # (3)
- cd src/lists/static
- npm install # (4)
- npx jasmine-browser-runner runSpecs
--config=tests/jasmine-browser-runner.config.mjs # (5)
1. image, variables, cache, and before_script all move out of the top level and into the test-python step, since they’re now specific to that step only.
2. Here’s our new step, test-js.
3. We install Firefox into the node image, just like we do for the Python one.
4. We don’t need to specify what to npm install, because that’s all in the package-lock.json file.
5. And here’s our command to run the tests.
And slap me over the head with a wet fish if that doesn’t pass first go! See Wow, There are Those JS Tests, Passing on the First Attempt! for a successful pipeline run.
And there we are! A complete CI build featuring all of our tests! Here are Both Our Jobs in all their Green Glory:
Nice to know that, no matter how lazy I get about running the full test suite on my own machine, the CI server will catch me. Another one of the Testing Goat’s agents in cyberspace, watching over us…
I’ve moved it to an appendix tho, cos it’s so gitlab-heavy.
I want to give a shout out to Woodpecker CI and Forgejo, two of the newer self-hosted CI options. And while I’m at it, to Jenkins, which did a great job for the first and second editions, and still does for many people.
If you want true independence from overly commercial interests, then self-hosted is the way to go. You’ll need your own server for both of these.
I tried both, and managed to get them working within an hour or two. Their documentation is good.
If you do decide to give them a go, I’d say, be a bit cautious about security options.
For example, you might decide you don’t want any old person from the Internet to be able to sign up for an account on your server:
DISABLE_REGISTRATION: true
But more power to you for giving it a go, I say!
We spent quite a bit of time debugging in this chapter, for example the unhelpful messages when Firefox wasn’t installed. Just as we did when preparing our deployment, having an environment on your local machine that’s as close as possible to the remote one is a big help; that’s why we chose to use a Docker image.
In CI, our tests also run in a Docker image (python:slim and node:slim), so one common pattern is to define a Docker image in your repo that you use for CI. Ideally it would also be as similar as possible to the one you use in production! A typical solution here is to use "multi-stage" Docker builds, with a base stage, a prod stage, and a dev/CI stage. In our case, the latter would have Firefox, Selenium, and the other test-only dependencies that we don’t need for prod. You can then run your tests locally inside the very same Docker image that’s used in CI.
Tip
Reproducibility is one of the key attributes we’re aiming for. The more your project grows in complexity, the more it’s worth investing in minimising the differences between local dev, CI, and prod.
We touched on the use of caches in CI for the pip download cache, but as CI pipelines grow in maturity, you’ll find you can make more and more use of caching.
It’s a topic for another time, but this is yet another way of trying to speed up the feedback cycle.
The natural next step is to finish our journey into automation, and set up a pipeline that will deploy our code all the way to production, each time we push code… as long as the tests pass!
I work through an example of how to do that in [appendix_CD]. I definitely encourage you to take a look.
Now, onto our last chapter of coding, everyone!
- Set up CI as soon as possible for your project
  As soon as your functional tests take more than a few seconds to run, you’ll find yourself avoiding running them all. Give this job to a CI server, to make sure that all your tests are getting run somewhere.

- Optimise for fast feedback
  CI feedback loops can be frustratingly slow. Optimising things to get results quicker is worth the effort. Run your fastest tests first, and try to minimise time spent on, e.g., dependency installation, by using caches.

- Set up screenshots and HTML dumps for failures
  Debugging test failures is easier if you can see what the page looked like when the failure occurred. This is particularly useful for debugging CI failures, but it’s also very useful for tests that you run locally.

- Be prepared to bump your timeouts
  A CI server may not be as speedy as your laptop, especially if it’s under load, running multiple tests at the same time. Be prepared to be even more generous with your timeouts, in order to minimise the chance of random failures.

- Take the next step, CD (Continuous Delivery)
  Once we’re running tests automatically, we can take the next step, which is to automate our deployments (when the tests pass). See [appendix_CD] for a worked example.