This chapter has had a big rewrite!
As always, feedback is welcome, but especially-especially since this stuff is all so new. Let me know how you get on :)
As our site grows, it takes longer and longer to run all of our functional tests. If this continues, the danger is that we’re going to stop bothering.
Rather than let that happen, we can automate the running of functional tests by setting up "Continuous Integration", or CI. That way, in day-to-day development, we can just run the FT that we’re working on at that time, and rely on CI to run all the other tests automatically and let us know if we’ve broken anything accidentally.
The unit tests should stay fast enough that we can keep running the full suite locally, every few seconds.
Note
Continuous Integration is another practice that was popularised by Kent Beck’s Extreme Programming (XP) movement in the 1990s.
As we’ll see, one of the great frustrations of CI is that the feedback loop is so much slower than working locally. As we go along, we’ll look for ways to optimise for that, where we can.
While debugging, we’ll also touch on the theme of reproducibility, the idea that we want to be able to reproduce behaviours of our CI environment locally, in the same way we try and make our production and dev environments as similar as possible.
CI as a concept has been around since the 90s at least, and traditionally would be done by configuring a server, perhaps under a desk in the corner of the office, with software on it that could pull down all the code from the main branch at the end of each day, and scripts to compile all the code and run all the tests, a process that became known as a "build". Then each morning developers would take a look at the results, and deal with any "broken builds".
As the practice has spread and feedback cycles have grown faster, CI software has evolved (Jenkins was a popular choice in the 2010s), and CI has become a common "cloud" service, designed to integrate with code hosting providers like GitHub, or even provided directly by them: GitHub has "GitHub Actions", which, because of that built-in prominence, is probably the most popular choice for Open Source projects these days. In a corporate environment, you might come across other solutions like CircleCI, Travis, and GitLab.
It is still absolutely possible to download and self-host your own CI server; in the 1e and 2e I demonstrated the use of Jenkins. But the installation and subsequent admin/maintenance burden is not trivial, so for this edition I wanted to pick a service that’s as much like the kind of thing you’re likely to encounter in your day job as possible, while trying not to endorse the largest commercial provider. There’s nothing wrong with GitHub Actions! They just don’t need any more help dominating the market.
So I’ve decided to use GitLab in this book. They are absolutely a commercial service, but they retain an open-source version, and you can self-host it in theory.
The syntax (it’s always YAML…) and core concepts are common across all providers, so the things you learn here will be replicable in whichever service you encounter in future.
Like most of the services out there, GitLab has a free tier, which will work fine for our purposes.
GitLab is primarily a code hosting service like GitHub, so the first thing to do is get our code up there.
Use the New Project → Create Blank Project option, as in Creating a New Repo on Gitlab.
First we set up GitLab as a "remote" for our project:
# substitute your username and project name as necessary
$ git remote add origin git@gitlab.com:yourusername/superlists.git

# or, if you already have "origin" defined:
$ git remote add gitlab git@gitlab.com:yourusername/superlists.git
$ git remote -v
gitlab  git@gitlab.com:hjwp/superlistz.git (fetch)
gitlab  git@gitlab.com:hjwp/superlistz.git (push)
origin  git@gitlab.com:hjwp/book-example.git (fetch)
origin  git@gitlab.com:hjwp/book-example.git (push)
Now we can push up our code with git push. The -u flag sets up a "remote-tracking" branch, so you can push and pull without explicitly specifying the remote and branch:
# if using 'origin', it will be the default:
$ git push -u

# or, if you have multiple remotes:
$ git push -u gitlab
Enumerating objects: 706, done.
Counting objects: 100% (706/706), done.
Delta compression using up to 11 threads
Compressing objects: 100% (279/279), done.
Writing objects: 100% (706/706), 918.72 KiB | 131.25 MiB/s, done.
Total 706 (delta 413), reused 682 (delta 408), pack-reused 0 (from 0)
remote: Resolving deltas: 100% (413/413), done.
To gitlab.com:hjwp/superlistz.git
 * [new branch]      main -> main
branch 'main' set up to track 'gitlab/main'.
If you refresh the GitLab UI, you should now see your code, as in CI Project Files on GitLab:
The "Pipeline" terminology was popularised by Dave Farley and Jez Humble in their book Continuous Delivery. The name alludes to the fact that a CI build typically has a series, where the process flows from one to another.
Go to Build → Pipelines, and you’ll see a list of example templates. When getting to know a new configuration language, it’s nice to be able to start with something that works, rather than a blank slate.
I chose the Python example template and made a few customisations:
# Use the same image as our Dockerfile
image: python:slim
# These two settings let us cache pip-installed packages;
# they came from the default template
variables:
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
cache:
paths:
- .cache/pip
# "setUp" phase, before the main build
before_script:
- python --version ; pip --version # For debugging
- pip install virtualenv
- virtualenv .venv
- source .venv/bin/activate
# This is the main build
test:
script:
- pip install -r requirements.txt # (1)
# unit tests
- python src/manage.py test lists accounts # (2)
# (if those pass) all tests, incl. functional.
- pip install selenium # (3)
- cd src && python manage.py test # (4)
YAML once again folks!
1. We start by installing our core requirements.
2. I’ve decided to run the unit tests first. This gives us an "early failure" if there’s any problem at this stage, and saves us from having to run, and more importantly wait for, the FTs.
3. Then we need Selenium for the functional tests. Again, I’m delaying this pip install until it’s absolutely necessary, to get feedback as quickly as possible.
4. And here is a full test run, including the functional tests.
Tip
It’s a good idea in CI pipelines to try and run the quickest tests first, so that you can get feedback as quickly as possible.
You can use the GitLab web UI to edit your pipeline YAML, and then when you save it you can go check for results straight away.
But it is also just a file in your repo!
So you can edit it locally.
You’ll need to commit it and then git push up to GitLab, and then go check the Jobs section in the Build UI.
$ git push gitlab
However you click through the UI, you should be able to find your way to the output of the build Job, as in First Build on GitLab:
Here’s a selection of what I saw in the output console:
Running with gitlab-runner 17.7.0~pre.103.g896916a8 (896916a8)
  on green-1.saas-linux-small-amd64.runners-manager.gitlab.com/default JLgUopmM, system ID: s_deaa2ca09de7
Preparing the "docker+machine" executor 00:20
Using Docker executor with image python:latest ...
Pulling docker image python:latest ...
[...]
$ python src/manage.py test lists accounts
Creating test database for alias 'default'...
Found 55 test(s).
System check identified no issues (0 silenced).
................../builds/hjwp/book-example/.venv/lib/python3.13/site-packages/django/core/handlers/base.py:61: UserWarning: No directory at: /builds/hjwp/book-example/src/static/
  mw_instance = middleware(adapted_handler)
.....................................
---------------------------------------------------------------------
Ran 55 tests in 0.129s

OK
Destroying test database for alias 'default'...
$ pip install selenium
Collecting selenium
  Using cached selenium-4.28.1-py3-none-any.whl.metadata (7.1 kB)
Collecting urllib3<3,>=1.26 (from urllib3[socks]<3,>=1.26->selenium)
[...]
Successfully installed attrs-25.1.0 certifi-2025.1.31 h11-0.14.0 idna-3.10
outcome-1.3.0.post0 pysocks-1.7.1 selenium-4.28.1 sniffio-1.3.1
sortedcontainers-2.4.0 trio-0.29.0 trio-websocket-0.12.1
typing_extensions-4.12.2 urllib3-2.3.0 websocket-client-1.8.0 wsproto-1.2.0
$ cd src && python manage.py test
Creating test database for alias 'default'...
Found 63 test(s).
System check identified no issues (0 silenced).
......../builds/hjwp/book-example/.venv/lib/python3.13/site-packages/django/core/handlers/base.py:61: UserWarning: No directory at: /builds/hjwp/book-example/src/static/
  mw_instance = middleware(adapted_handler)
...............................................EEEEEEEE
======================================================================
ERROR: test_layout_and_styling (functional_tests.test_layout_and_styling.LayoutAndStylingTest.test_layout_and_styling)
---------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/hjwp/book-example/src/functional_tests/base.py", line 30, in setUp
    self.browser = webdriver.Firefox()
                   ~~~~~~~~~~~~~~~~~^^
[...]
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 255
---------------------------------------------------------------------
Ran 63 tests in 8.658s

FAILED (errors=8)
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 255
You can see we got through the unit tests, and then in the full test run we have 8 errors out of 63 tests. The FTs are all failing.
I’m "lucky" because I’ve done this sort of thing many times before, so I know what to expect: it’s failing because Firefox isn’t installed in the image we’re using.
Let’s modify the script to add an apt install. Again, we’ll do it as late as possible.
# This is the main build
test:
script:
- pip install -r requirements.txt
# unit tests
- python src/manage.py test lists accounts
# (if those pass) all tests, incl. functional.
- apt update -y && apt install -y firefox-esr # (1)
- pip install selenium
- cd src && python manage.py test
1. We use the Debian Linux apt package manager to install Firefox. firefox-esr is the "extended support release", which is a more stable version of Firefox to test against.
If you run it again, and wait a bit, you’ll see we get a slightly different failure:
$ apt-get update -y && apt-get install -y firefox-esr
Get:1 http://deb.debian.org/debian bookworm InRelease [151 kB]
Get:2 http://deb.debian.org/debian bookworm-updates InRelease [55.4 kB]
Get:3 http://deb.debian.org/debian-security bookworm-security InRelease [48.0 kB]
[...]
The following NEW packages will be installed:
  adwaita-icon-theme alsa-topology-conf alsa-ucm-conf at-spi2-common
  at-spi2-core dbus dbus-bin dbus-daemon dbus-session-bus-common
  dbus-system-bus-common dbus-user-session dconf-gsettings-backend
  dconf-service dmsetup firefox-esr fontconfig fontconfig-config
[...]
Get:117 http://deb.debian.org/debian-security bookworm-security/main amd64 firefox-esr amd64 128.7.0esr-1~deb12u1 [69.8 MB]
[...]
Selecting previously unselected package firefox-esr.
Preparing to unpack .../105-firefox-esr_128.7.0esr-1~deb12u1_amd64.deb ...
Adding 'diversion of /usr/bin/firefox to /usr/bin/firefox.real by firefox-esr'
Unpacking firefox-esr (128.7.0esr-1~deb12u1) ...
[...]
Setting up firefox-esr (128.7.0esr-1~deb12u1) ...
update-alternatives: using /usr/bin/firefox-esr to provide /usr/bin/x-www-browser (x-www-browser) in auto mode
[...]
======================================================================
ERROR: test_multiple_users_can_start_lists_at_different_urls (functional_tests.test_simple_list_creation.NewVisitorTest.test_multiple_users_can_start_lists_at_different_urls)
---------------------------------------------------------------------
Traceback (most recent call last):
  File "/builds/hjwp/book-example/src/functional_tests/base.py", line 30, in setUp
    self.browser = webdriver.Firefox()
                   ~~~~~~~~~~~~~~~~~^^
[...]
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 1
---------------------------------------------------------------------
Ran 63 tests in 3.654s

FAILED (errors=8)
We can see Firefox installing OK, but we still get an error. This time it’s exit code 1.
The cycle of "change .gitlab-ci.yml, push, wait for a build, check results" is painfully slow.
Let’s see if we can reproduce this error locally.
To reproduce the CI environment locally, I put together a quick Dockerfile, by copy-pasting the steps from the script section and prefixing them with RUN commands:
FROM python:slim
RUN pip install virtualenv
RUN virtualenv .venv
# this won't work
# RUN source .venv/bin/activate
# use full path to venv instead.
COPY requirements.txt requirements.txt
RUN .venv/bin/pip install -r requirements.txt
RUN apt update -y && apt install -y firefox-esr
RUN .venv/bin/pip install selenium
COPY infra/debug-ci.py debug-ci.py
CMD .venv/bin/python debug-ci.py
And let’s add a little debug script at debug-ci.py:
from selenium import webdriver
# just try to open a selenium session
webdriver.Firefox().quit()
We build and run it like this:
$ docker build -f infra/Dockerfile.ci -t debug-ci . && \
  docker run -it debug-ci
[...]
 => [internal] load build definition from infra/Dockerfile.ci           0.0s
 => => transferring dockerfile: [...]
 => [internal] load metadata for docker.io/library/python:slim          [...]
 => [1/8] FROM docker.io/library/python:slim@sha256:[...]
 => CACHED [2/8] RUN pip install virtualenv                             0.0s
 => CACHED [3/8] RUN virtualenv .venv                                   0.0s
 => CACHED [4/8] COPY requirements.txt requirements.txt                 0.0s
 => CACHED [5/8] RUN .venv/bin/pip install -r requirements.txt          0.0s
 => CACHED [6/8] RUN apt update -y && apt install -y firefox-esr        0.0s
 => CACHED [7/8] RUN .venv/bin/pip install selenium                     0.0s
 => [8/8] COPY infra/debug-ci.py debug-ci.py                            0.0s
 => exporting to image                                                  0.0s
 => => exporting layers                                                 0.0s
 => => writing image sha256:[...]
 => => naming to docker.io/library/debug-ci                             0.0s
Traceback (most recent call last):
  File "//.venv/lib/python3.13/site-packages/selenium/webdriver/common/driver_finder.py", line 67, in _binary_paths
    output = SeleniumManager().binary_paths(self._to_args())
[...]
selenium.common.exceptions.WebDriverException: Message: Unsupported platform/architecture combination: linux/aarch64

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "//debug-ci.py", line 4, in <module>
    webdriver.Firefox().quit()
    ~~~~~~~~~~~~~~~~~^^
[...]
selenium.common.exceptions.NoSuchDriverException: Message: Unable to obtain driver for firefox; For documentation on this error, please visit: https://www.selenium.dev/documentation/webdriver/troubleshooting/errors/driver_location
You might not see this one: that "Unsupported platform/architecture combination" error is spurious; it’s because I was on an ARM-based Mac. Let’s try again, specifying the platform explicitly:
$ docker build -f infra/Dockerfile.ci -t debug-ci --platform=linux/amd64 . && \
  docker run --platform=linux/amd64 -it debug-ci
[...]
Traceback (most recent call last):
  File "//debug-ci.py", line 4, in <module>
    webdriver.Firefox().quit()
[...]
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 1
OK, that’s a repro of our issue. But no further clues yet!
Getting debug information out of Selenium can be a bit fiddly. I tried two avenues, setting options and setting the service; the former doesn’t really work as far as I can tell, but the latter does. There is some limited info in the Selenium docs.
import subprocess
from selenium import webdriver
options = webdriver.FirefoxOptions() # (1)
options.log.level = "trace"
service = webdriver.FirefoxService( # (2)
log_output=subprocess.STDOUT, service_args=["--log", "trace"]
)
# just try to open a selenium session
webdriver.Firefox(options=options, service=service).quit()
1. This is how I attempted to increase the log level using options. I had to reverse-engineer it from the source code, and it doesn’t seem to work anyway, but I thought I’d leave it here for future reference.
2. This is the FirefoxService config class, which does seem to let you print some debug info. I’m configuring it to print to standard-out.
Sure enough we can see some output now!
$ docker build -f infra/Dockerfile.ci -t debug-ci --platform=linux/amd64 . && \
  docker run --platform=linux/amd64 -it debug-ci
[...]
1234567890111 geckodriver INFO Listening on 127.0.0.1:XXXX
1234567890112 webdriver::server DEBUG -> POST /session {"capabilities": {"firstMatch": [{}], "alwaysMatch": {"browserName": "firefox", "acceptInsecureCerts": true, ... , "moz:firefoxOptions": {"binary": "/usr/bin/firefox", "prefs": {"remote.active-protocols": 3}, "log": {"level": "trace"}}}}}
1234567890111 geckodriver::capabilities DEBUG Trying to read firefox version from ini files
1234567890111 geckodriver::capabilities DEBUG Trying to read firefox version from binary
1234567890111 geckodriver::capabilities DEBUG Found version 128.7esr
1740029792102 mozrunner::runner INFO Running command: MOZ_CRASHREPORTER="1" MOZ_CRASHREPORTER_NO_REPORT="1" MOZ_CRASHREPORTER_SHUTDOWN="1" [...] "--remote-debugging-port" [...] "-no-remote" "-profile" "/tmp/rust_mozprofile[...]
1234567890111 geckodriver::marionette DEBUG Waiting 60s to connect to browser on 127.0.0.1
1234567890111 geckodriver::browser TRACE Failed to open /tmp/rust_mozprofile[...]
1234567890111 geckodriver::marionette TRACE Retrying in 100ms
Error: no DISPLAY environment variable specified
1234567890111 geckodriver::browser DEBUG Browser process stopped: exit status: 1
1234567890112 webdriver::server DEBUG <- 500 Internal Server Error {"value":{"error":"unknown error","message":"Process unexpectedly closed with status 1","stacktrace":""}}
Traceback (most recent call last):
  File "//debug-ci.py", line 13, in <module>
    webdriver.Firefox(options=options, service=service).quit()
[...]
selenium.common.exceptions.WebDriverException: Message: Process unexpectedly closed with status 1
Well, it wasn’t immediately obvious what was going on there, but I did eventually get a clue from the line that says Error: no DISPLAY environment variable specified.
Out of curiosity, I thought I’d try running firefox directly:
$ docker build -f infra/Dockerfile.ci -t debug-ci --platform=linux/amd64 . && \
  docker run --platform=linux/amd64 -it debug-ci firefox
[...]
Error: no DISPLAY environment variable specified
Sure enough, the same error.
And if you search around for this error, you’ll eventually find enough pointers to the answer: Firefox is crashing because it can’t find a display. Servers are "headless", meaning they don’t have a screen. Thankfully, Firefox has a headless mode, which we can enable by setting an environment variable, MOZ_HEADLESS.
Let’s confirm that locally. We’ll use the -e flag for docker run:
$ docker build -f infra/Dockerfile.ci -t debug-ci --platform=linux/amd64 . && \
  docker run -e MOZ_HEADLESS=1 --platform=linux/amd64 -it debug-ci
1234567890111 geckodriver INFO Listening on 127.0.0.1:43137
[...]
*** You are running in headless mode.
[...]
1234567890112 webdriver::server DEBUG Teardown
[...]
1740030525996 Marionette DEBUG Closed connection 0
1234567890111 geckodriver::browser DEBUG Browser process stopped: exit status: 0
1234567890112 webdriver::server DEBUG <- 200 OK
[...]
It takes quite a long time to run, and there’s lots of debug output, but… it looks OK!
Let’s set that environment variable in our CI script:
variables:
# Put pip-cache in home folder so we can use gitlab cache
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
# Make Firefox run headless.
MOZ_HEADLESS: "1"
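As an aside, if you ever want to switch headless mode on from within the test code itself, rather than via an environment variable, Selenium’s FirefoxOptions can do that too. Here’s a minimal sketch (we don’t need it for our setup, since MOZ_HEADLESS does the job without any code changes):

from selenium import webdriver

# Ask geckodriver to launch Firefox with the -headless flag,
# which has the same effect as setting MOZ_HEADLESS=1.
options = webdriver.FirefoxOptions()
options.add_argument("-headless")

browser = webdriver.Firefox(options=options)
browser.quit()

One nice property of the environment-variable approach is that we can leave base.py untouched, and still watch the browser pop up when we run the FTs on our own machines.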
Tip
Using a local Docker image to repro the CI environment is a hint that it might be worth investing time in running CI in a custom Docker image that you fully control; this is another way of improving reproducibility. We won’t have time to go into detail in this book though.
That worked! Or at least, it almost did. All but one of the FTs passed for me, but there was one unexpected failure:
+ python manage.py test functional_tests
......F.
======================================================================
FAIL: test_can_start_a_todo_list (functional_tests.test_simple_list_creation.NewVisitorTest)
---------------------------------------------------------------------
Traceback (most recent call last):
  File "...goat-book/functional_tests/test_simple_list_creation.py", line 38, in test_can_start_a_todo_list
    self.wait_for_row_in_list_table('2: Use peacock feathers to make a fly')
  File "...goat-book/functional_tests/base.py", line 51, in wait_for_row_in_list_table
    raise e
  File "...goat-book/functional_tests/base.py", line 47, in wait_for_row_in_list_table
    self.assertIn(row_text, [row.text for row in rows])
AssertionError: '2: Use peacock feathers to make a fly' not found in ['1: Buy peacock feathers']
---------------------------------------------------------------------
Now you might not see this error, but it’s common for the switch to CI to flush out some "flaky" tests, things that will fail intermittently. In CI a common cause is the "noisy neighbour" problem, where the CI server might be much slower than your own machine, thus flushing out some race conditions, or in this case, just randomly hanging for a few seconds, taking us past the default timeout.
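It’s worth remembering why the timeout matters here. Our FT helpers, like the wait_for_row_in_list_table you can see in that traceback, are built on a polling pattern: keep retrying an assertion until it passes, or until MAX_WAIT seconds have gone by. Here’s a rough sketch of the idea (the real helper in base.py differs in its details):

import time

from selenium.common.exceptions import WebDriverException

MAX_WAIT = 5  # seconds -- this is the number we end up being more generous with for CI


def wait_for(assertion_fn):
    """Retry an assertion until it passes, or until MAX_WAIT seconds have elapsed."""
    start_time = time.time()
    while True:
        try:
            return assertion_fn()  # if the assertion passes, we're done
        except (AssertionError, WebDriverException) as e:
            if time.time() - start_time > MAX_WAIT:
                raise e  # give up and surface the last failure
            time.sleep(0.5)  # otherwise, wait a little and poll again

So when a CI runner randomly hangs for a few seconds, the polling simply runs out of time, and we see exactly the kind of intermittent failure above. One blunt but effective fix, which we’ll resort to shortly, is to be more generous with MAX_WAIT.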
Let’s give ourselves some tools to help debug though.
To be able to debug unexpected failures that happen on a remote server, it would be good to see a picture of the screen at the moment of the failure, and maybe also a dump of the HTML of the page.
We can do that using some custom logic in our FT class tearDown. We’ll need to do a bit of introspection of unittest internals, a private attribute called ._outcome, but this will work:
import os
import time
from datetime import datetime
from pathlib import Path
[...]
MAX_WAIT = 5
SCREEN_DUMP_LOCATION = Path(__file__).absolute().parent / "screendumps"
[...]
def tearDown(self):
if self._test_has_failed():
if not SCREEN_DUMP_LOCATION.exists():
SCREEN_DUMP_LOCATION.mkdir(parents=True)
self.take_screenshot()
self.dump_html()
self.browser.quit()
super().tearDown()
def _test_has_failed(self):
# slightly obscure but couldn't find a better way!
return self._outcome.result.failures or self._outcome.result.errors
We first create a directory for our screenshots if necessary. Then we use the Selenium method get_screenshot_as_file() and the browser.page_source attribute, for our image and HTML dumps respectively:
def take_screenshot(self):
path = SCREEN_DUMP_LOCATION / self._get_filename("png")
print("screenshotting to", path)
self.browser.get_screenshot_as_file(str(path))
def dump_html(self):
path = SCREEN_DUMP_LOCATION / self._get_filename("html")
print("dumping page HTML to", path)
path.write_text(self.browser.page_source)
And finally here’s a way of generating a unique filename identifier, which includes the name of the test and its class, as well as a timestamp:
def _get_filename(self, extension):
timestamp = datetime.now().isoformat().replace(":", ".")[:19]
return (
f"{self.__class__.__name__}.{self._testMethodName}-{timestamp}.{extension}"
)
You can test this first locally by deliberately breaking one of the tests, with a self.fail() for example.
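For instance, here’s the sort of throwaway change I mean, tacked onto the end of one of the existing FTs (I’ve used the My Lists test from earlier chapters, but any test will do; just remember to delete the line again afterwards):

    def test_logged_in_users_lists_are_saved_as_my_lists(self):
        [...]
        self.fail("deliberately failing, to check the screenshot logic")  # temporary!

Run that test, and you’ll see something like this: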
$ ./src/manage.py test functional_tests.test_my_lists
[...]
Fscreenshotting to ...goat-book/src/functional_tests/screendumps/MyListsTest.test_logged_in_users_lists_are_saved_as_my_lists-[...]
dumping page HTML to ...goat-book/src/functional_tests/screendumps/MyListsTest.test_logged_in_users_lists_are_saved_as_my_lists-[...]
Fscreenshotting to ...goat-book/src/functional_tests/screendumps/MyListsTest.test_logged_in_users_lists_are_saved_as_my_lists-2025-02-18T11.29.00.png
dumping page HTML to ...goat-book/src/functional_tests/screendumps/MyListsTest.test_logged_in_users_lists_are_saved_as_my_lists-2025-02-18T11.29.00.html
We also need to tell GitLab to "save" these files so that we can actually look at them. This is called "artifacts":
test:
[...]
script:
[...]
artifacts: # (1)
when: always # (2)
paths: # (1)
- src/functional_tests/screendumps/
1. artifacts is the name of the key, and the paths argument is fairly self-explanatory. You can use wildcards here; there’s more info in the GitLab docs.
2. One thing the docs didn’t make obvious is that you need when: always, because otherwise it won’t save artifacts for failed jobs. That was annoyingly hard to figure out!
In any case, that should work. If we commit the code and push it up to GitLab, we should see a new build job.
$ echo "src/functional_tests/screendumps" >> .gitignore $ git commit -am "add screenshot on failure to FT runner" $ git push
In its output, we’ll see the screenshots and HTML dumps being saved:
screendumps/LoginTest.test_can_get_email_link_to_log_in-window0-2014-01-22T17.45.12.html
Fscreenshotting to /builds/hjwp/book-example/src/functional_tests/screendumps/NewVisitorTest.test_can_start_a_todo_list-2025-02-17T17.51.01.png
dumping page HTML to /builds/hjwp/book-example/src/functional_tests/screendumps/NewVisitorTest.test_can_start_a_todo_list-2025-02-17T17.51.01.html
Not Found: /favicon.ico
.screenshotting to /builds/hjwp/book-example/src/functional_tests/screendumps/NewVisitorTest.test_multiple_users_can_start_lists_at_different_urls-2025-02-17T17.51.06.png
dumping page HTML to /builds/hjwp/book-example/src/functional_tests/screendumps/NewVisitorTest.test_multiple_users_can_start_lists_at_different_urls-2025-02-17T17.51.06.html
======================================================================
FAIL: test_can_start_a_todo_list (functional_tests.test_simple_list_creation.NewVisitorTest.test_can_start_a_todo_list)
[...]
And to the right, some new UI options appear that let us browse the artifacts, as in Artifacts Appear on the Right of the Build Job.
And if you navigate through, you’ll see something like Our Screenshot in the GitLab UI, Looking Unremarkable:
Hm. No obvious clues there. Well, when in doubt, bump the timeout, as the old adage goes:
MAX_WAIT = 10
Then we can rerun the build by pushing, and confirm it now works. At this point we should have a working pipeline, as in A Successful GitLab Pipeline:
There’s a set of tests we almost forgot—the JavaScript tests. Currently our "test runner" is an actual web browser. To get them running in CI, we need a command-line test runner.
Note
Our JavaScript tests currently test the interaction between our code and the bootstrap framework/CSS, so we still need a real browser to be able to make our visibility checks work.
Thankfully, the Jasmine docs point us straight towards the kind of tool we need: Jasmine Browser Runner.
It’s time to stop pretending we’re not in the JavaScript game. We’re doing web development. That means we do JavaScript. That means we’re going to end up with node.js on our computers. It’s just the way it has to be.
Head to the node.js homepage and follow the instructions there. It should guide you through installing the "node version manager" (NVM), and then getting the latest version of node.
$ nvm install 22  # or whichever the latest version is
Installing Node v22.14.0 (arm64)
[...]
$ node -v
v22.14.0
The docs suggest we install it like this, and then run the init command to generate a default config file:
$ cd src/lists/static
$ npm install --save-dev jasmine-browser-runner jasmine-core
[...]
added 151 packages in 4s

$ cat package.json  # this is the equivalent of requirements.txt
{
  "devDependencies": {
    "jasmine-browser-runner": "^3.0.0",
    "jasmine-core": "^5.6.0"
  }
}

$ ls node_modules/  # will show several dozen directories

$ npx jasmine-browser-runner init
Wrote configuration to spec/support/jasmine-browser.mjs.
Well, we now have about a million files in node_modules/ (which is essentially JavaScript’s version of a virtualenv), and we also have a new config file in spec/support/jasmine-browser.mjs.
That’s not the ideal place, because we’ve said our tests live in a folder called tests, so let’s move the config file in there.
$ mv spec/support/jasmine-browser.mjs tests/jasmine-browser-runner.config.mjs
$ rm -rf spec
Then let’s edit it slightly, to specify a few things correctly:
export default {
srcDir: ".", // (1)
srcFiles: [
"*.js"
],
specDir: "tests", // (2)
specFiles: [
"**/*[sS]pec.js"
],
helpers: [
"helpers/**/*.js"
],
env: {
stopSpecOnExpectationFailure: false,
stopOnSpecFailure: false,
random: true,
forbidDuplicateNames: true
},
listenAddress: "localhost",
hostname: "localhost",
browser: {
name: "headlessFirefox" // (3)
}
};
1. Our source files are in the current directory, src/lists/static, i.e. lists.js.
2. And our spec files are in tests/.
3. And here we say we want to use the "headless" version of Firefox. (We could have done this by setting MOZ_HEADLESS at the command line again, but this saves us from having to remember.)
Let’s try running it now. We use the --config option to pass it the now non-standard path to the config file:
$ npx jasmine-browser-runner runSpecs --config=tests/jasmine-browser-runner.config.mjs
Jasmine server is running here: http://localhost:62811
Jasmine tests are here: ...goat-book/src/lists/static/tests
Source files are here: ...goat-book/src/lists/static
Running tests in the browser...
Randomized with seed 17843
Started
.F.

Failures:
1) Superlists tests error message should be hidden on input
  Message:
    Expected true to be false.
  Stack:
    <Jasmine>
    @http://localhost:62811/spec/Spec.js:46:40
    <Jasmine>

3 specs, 1 failure
Finished in 0.014 seconds
Randomized with seed 17843 (jasmine-browser-runner runSpecs --seed=17843)
Could be worse! 1 failure out of 3 specs.
Unfortunately, it’s the most important test:
it("error message should be hidden on input", () => {
initialize(inputSelector);
textInput.dispatchEvent(new InputEvent("input"));
expect(errorMsg.checkVisibility()).toBe(false);
});
Ah yes, remember how I said the whole reason we need to use a browser-based test runner is that our visibility checks depend on the Bootstrap CSS framework?
In the HTML spec runner which we’d configured so far, we load Bootstrap using a <link> tag:
<!-- Bootstrap CSS -->
<link href="../bootstrap/css/bootstrap.min.css" rel="stylesheet">
And here’s how we load it for jasmine-browser-runner:
export default {
srcDir: ".",
srcFiles: [
"*.js"
],
specDir: "tests",
specFiles: [
"**/*[sS]pec.js"
],
cssFiles: [ // (1)
"bootstrap/css/bootstrap.min.css" // (1)
],
helpers: [
"helpers/**/*.js"
],
1. The cssFiles key is how you tell the runner to load, er, some CSS. I found that out in the docs.
Let’s give that a go…
$ npx jasmine-browser-runner runSpecs --config=tests/jasmine-browser-runner.config.mjs
Jasmine server is running here: http://localhost:62901
Jasmine tests are here:
/Users/harry.percival/workspace/Book-TDD-Web-Dev-Python/source/chapter_25_CI/superlists/src/lists/static/tests
Source files are here:
/Users/harry.percival/workspace/Book-TDD-Web-Dev-Python/source/chapter_25_CI/superlists/src/lists/static
Running tests in the browser...
Randomized with seed 06504
Started
...

3 specs, 0 failures
Finished in 0.016 seconds
Randomized with seed 06504 (jasmine-browser-runner runSpecs --seed=06504)
Hooray! That works locally; let’s get it into CI.
# add the package.json, which saves our node dependencies
$ git add src/lists/static/package.json src/lists/static/package-lock.json

# ignore the node_modules/ directory
$ echo "node_modules/" >> .gitignore

# and our config file
$ git add src/lists/static/tests/jasmine-browser-runner.config.mjs

$ git commit -m "config for node + jasmine-browser-runner for JS tests"
We now want two different build steps, so let’s rename test to test-python and move all its specific bits, like variables and before_script, inside it, and then create a separate step called test-js, with a similar structure:
test-python:
# Use the same image as our Dockerfile
image: python:slim # (1)
variables: # (1)
# Put pip-cache in home folder so we can use gitlab cache
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
# Make Firefox run headless.
MOZ_HEADLESS: "1"
cache: # (1)
paths:
- .cache/pip
# "setUp" phase, before the main build
before_script: # (1)
- python --version ; pip --version # For debugging
- pip install virtualenv
- virtualenv .venv
- source .venv/bin/activate
script:
- pip install -r requirements.txt
# unit tests
- python src/manage.py test lists accounts
# (if those pass) all tests, incl. functional.
- apt update -y && apt install -y firefox-esr
- pip install selenium
- cd src && python manage.py test
artifacts:
when: always
paths:
- src/functional_tests/screendumps/
test-js: # (2)
image: node:slim
script:
- apt update -y && apt install -y firefox-esr # (3)
- cd src/lists/static
- npm install # (4)
- npx jasmine-browser-runner runSpecs
--config=tests/jasmine-browser-runner.config.mjs # (5)
1. image, variables, cache, and before_script all move out of the top level and into the test-python step, since they’re now specific to that step only.
2. Here’s our new step, test-js.
3. We install Firefox into the node image, just like we do for the Python one.
4. We don’t need to specify what to npm install, because that’s all in the package-lock.json file.
5. And here’s our command to run the tests.
And slap me over the head with a wet fish if that doesn’t pass first go! See Wow, There are Those JS Tests, Passing on the First Attempt! for a successful pipeline run.
And there we are! A complete CI build featuring all of our tests! Here are Both Our Jobs in all their Green Glory:
Nice to know that, no matter how lazy I get about running the full test suite on my own machine, the CI server will catch me. Another one of the Testing Goat’s agents in cyberspace, watching over us…
I’ve moved it to an appendix tho, cos it’s so gitlab-heavy.
I want to give a shout out to Woodpecker CI and Forgejo, two of the newer self-hosted CI options. And while I’m at it, to Jenkins, which did a great job for the first and second editions, and still does for many people.
If you want true independence from overly commercial interests, then self-hosted is the way to go. You’ll need your own server for both of these.
I tried both, and managed to get them working within an hour or two. Their documentation is good.
If you do decide to give them a go, I’d say, be a bit cautious about security options.
For example, you might decide you don’t want any old person from the Internet to be able to sign up for an account on your server:
DISABLE_REGISTRATION: true
But more power to you for giving it a go, I say!
We spent quite a bit of time debugging in this chapter, for example the unhelpful messages when Firefox wasn’t installed. Just as we did when preparing our deployment, having an environment on your local machine that’s as close as possible to the remote one is a big help; that’s why we chose to use a Docker image.
In CI, our tests also run in a Docker image (python:slim and node:slim), so one common pattern is to define a Docker image in your repo that you use for CI. Ideally it would also be as similar as possible to the one you use in production! A typical solution here is to use "multi-stage" Docker builds, with a base stage, a prod stage, and a dev/CI stage. In our case, the latter would have Firefox, Selenium, and the other test-only dependencies that we don’t need for prod. You can then run your tests locally inside the very same Docker image that’s used in CI.
Tip
Reproducibility is one of the key attributes we’re aiming for. The more your project grows in complexity, the more it’s worth investing in minimising the differences between local dev, CI, and prod.
We touched on the use of caches in CI for the pip download cache, but as CI pipelines grow in maturity, you’ll find you can make more and more use of caching.
It’s a topic for another time, but this is yet another way of trying to speed up the feedback cycle.
The natural next step is to finish our journey into automation, and set up a pipeline that will deploy our code all the way to production, each time we push code… as long as the tests pass!
I work through an example of how to do that in [appendix_CD]. I definitely encourage you to take a look.
Now, onto our last chapter of coding, everyone!
- Set up CI as soon as possible for your project
  As soon as your functional tests take more than a few seconds to run, you’ll find yourself avoiding running them all. Give this job to a CI server, to make sure that all your tests are getting run somewhere.

- Optimise for fast feedback
  CI feedback loops can be frustratingly slow. Optimising things to get results quicker is worth the effort. Run your fastest tests first, and try to minimise time spent on, e.g., dependency installation, by using caches.

- Set up screenshots and HTML dumps for failures
  Debugging test failures is easier if you can see what the page looked like when the failure occurred. This is particularly useful for debugging CI failures, but it’s also very useful for tests that you run locally.

- Be prepared to bump your timeouts
  A CI server may not be as speedy as your laptop, especially if it’s under load, running multiple tests at the same time. Be prepared to be even more generous with your timeouts, in order to minimise the chance of random failures.

- Take the next step, CD (Continuous Delivery)
  Once we’re running tests automatically, we can take the next step, which is to automate our deployments (when the tests pass). See [appendix_CD] for a worked example.