From b2b4d5c1e1a1e6af8d84b8b47676bfebf673aa78 Mon Sep 17 00:00:00 2001 From: Jonathan Matthews Date: Mon, 30 Dec 2024 10:59:43 +0000 Subject: [PATCH] docs: update for containerised setup This adds some docs to help a developer get started modifying and running YNR inside a container. Also: structure README.md's headings more hierarchically, leaving only a single H1 on the page; some light rewording in README.md to improve its flow, along with a short new section that points a member of the public towards whocanivotefor.co.uk if they happen to come across this repo. --- README.md | 86 +++++---- docs/DEVELOPMENT.md | 242 ++++++++++++++++++++++++ docs/INSTALL.md | 157 +++++++-------- docs/INSTALL.old.md | 91 +++++++++ ynr/apps/sopn_parsing/README.md | 3 + ynr/settings/local.py.container.example | 4 + 6 files changed, 455 insertions(+), 128 deletions(-) create mode 100644 docs/DEVELOPMENT.md create mode 100644 docs/INSTALL.old.md diff --git a/README.md b/README.md index 1cddc3d44..20bd6f366 100644 --- a/README.md +++ b/README.md @@ -3,73 +3,81 @@ [![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff) ![CodeQL](https://github.com/DemocracyClub/yournextrepresentative/workflows/CodeQL/badge.svg) +# A website for crowd-sourcing structured data about election candidates -# A website for crowd-sourcing structured election candidate data +[**candidates.democracyclub.org.uk**](https://candidates.democracyclub.org.uk) -https://candidates.democracyclub.org.uk/ - -YourNextRepresentative is a open source platform for -crowd-sourcing information about candidates for political office +YourNextRepresentative ("**YNR**") is an open source platform +for crowd-sourcing information about candidates for political office, and making it available as open data to anyone. 
+YNR collects some core data, including: +- who is standing, +- what party they’re standing for, +- their contact details, and +- their social media accounts. + +YNR requires that each change is submitted with a source, so that the collected +information can be verified. + +## Using YNR + +**To find out information** about who you can vote for in upcoming elections, head +over to [whocanivotefor.co.uk](https://whocanivotefor.co.uk) and search for +candidates in your area. -The core data that YourNextRepresentative collects includes who -is standing, what party they’re standing for, their contact -details, their social media accounts etc. The software requires -that each change is submitted with a source, so that the -collected information can be independently checked. +**To contribute information** about candidates, use the YNR application at +[candidates.democracyclub.org.uk](https://candidates.democracyclub.org.uk). -# Installation +## Developing YNR -See [INSTALL.md](https://github.com/DemocracyClub/yournextrepresentative/blob/master/docs/INSTALL.md) +Before you can start modifying the YNR application and website, you'll need to +install its development prerequisites -- as detailed in +[`docs/INSTALL.md`](docs/INSTALL.md). -# Known Bugs +After you've confirmed that the prerequisites are working correctly on your +machine you'll be able to use the workflows detailed in +[`docs/DEVELOPMENT.md`](docs/DEVELOPMENT.md) to make changes to YNR. + +## Known Bugs You can find a list of known issues to work on here: * https://github.com/DemocracyClub/yournextrepresentative/issues -# Acknowledgements +## Acknowledgements This codebase was originally forked from [mysociety/yournextrepresentative](http://github.com/mysociety/yournextrepresentative) -We no longer track the upstream but we thank [mySociety](http://mysociety.org/) +We no longer track the upstream but we thank [mySociety](https://mysociety.org/) for their work on the project which we have been able to build on. 
-# API Versions +## API Versions v0.9 is legacy code and is now frozen. v1.0 is currently in alpha. We plan on publishing a v1 API once we have some more feedback from users and we think it’s stable enough. -# Statement Of Persons Nominated (SOPN) Parsing - -YNR uses `pypandoc` (which relies on `pandoc`) to convert SOPN documents to PDF, as needed, to be parsed. +## Statement Of Persons Nominated (SOPN) Parsing -To install `pandoc`, visit this page and follow the instructions for you operating system: -https://pandoc.org/installing.html +See [`ynr/apps/sopn_parsing`](ynr/apps/sopn_parsing#readme). -Once `pandoc` is installed +## Sentry Error Reporting -Install pypandoc (or via `requirements.txt`): +Sentry is used to report errors in production. We have added a url for `sentry-debug` to the [`urls.py`](ynr/urls.py#L42) file. This is to allow us to verify that Sentry is configured correctly and working in production. -`pip install pandoc` +## Pre-election Tasks -If `pypandoc` does not install via `pip`, visit https://pypi.org/project/pypandoc/ for further instructions. +### Enable Candidate Leaderboard -# Sentry Error Reporting - -Sentry is used to report errors in production. We have added a url for `sentry-debug` to the `urls.py` file. This is to allow us verify that Sentry is configured correctly and working in production. - -``` - -# Pre-election Tasks - -# Enable Candidate Leaderboard - -The candidate leaderboard is a way of showing the most active candidates on the site. It is a way of encouraging volunteers to add more information about candidates and elections. +The candidate leaderboard shows the most active contributors to the site. +It is a way of encouraging volunteers to add more information about candidates and elections. We take a slice of edits in YNR and assign them to a election leaderboard.
- -This is defined here: https://github.com/DemocracyClub/yournextrepresentative/blob/master/ynr/apps/candidates/views/mixins.py#L20 +This is defined in [`ynr/apps/candidates/views/mixins.py`](ynr/apps/candidates/views/mixins.py#L20). We can modify the old value to reflect the current election. Change, PR, merge, [currently Sym needs to deploy] -If this is a General Election, the parliamentary candidates can be imported using a google sheet csv url with `python manage candidatebot_import_next_ppcs --sheet-url SHEET_URL` +If this is a General Election, the parliamentary candidates can be imported using a google sheet csv url with: +``` +podman compose up -d dbpsql +./scripts/container.run.bash python manage.py candidatebot_import_next_ppcs --sheet-url SHEET_URL +podman compose down +``` diff --git a/docs/DEVELOPMENT.md b/docs/DEVELOPMENT.md new file mode 100644 index 000000000..06f8e61bd --- /dev/null +++ b/docs/DEVELOPMENT.md @@ -0,0 +1,242 @@ +# Developing YNR locally + +This is a guide to the tools and workflows that you'll need to use when +changing and updating YNR on your development machine. + +## Installation + +Install and test the development prerequisites as detailed in +[INSTALL.md](INSTALL.md). This will leave you with a known-good setup that's +able to run the commands described in this guide. + +## Quick start + +### Restoring data + +If you have access to a database dump from a YNR instance you can restore it to +a containerised database as follows: + +1. Start the database container: + `podman compose up -d dbpsql` +1. Restore the database dump: + ``` + cat path/to/database.dump \ + | podman compose exec -T dbpsql pg_restore -d ynr -U ynr --no-owner + ``` +1. Apply any pending migrations: + `./scripts/container.run.bash python manage.py migrate` +1. Shut down the database container: + `podman compose down` + + + +### Running the app + +1. Add or update any environment variables in `env/frontend.env` as required. +1. 
Incorporate the settings from + [`ynr/settings/local.py.container.example`](../ynr/settings/local.py.container.example) + into your gitignored `ynr/settings/local.py` file. +1. Start the compose stack: + `podman compose up -d` +1. (In a separate terminal) Start tailing the stack's logs: + `podman compose logs --follow` (you can safely CTRL-C this process at any time). +1. Build some required JS resources in the running frontend container: + `./scripts/container.exec.bash npm run build` +1. (**If your active `ynr/settings/...` file does NOT include `DEBUG = True`**) \ + Collect the static assets: + `./scripts/container.manage-py.bash collectstatic --no-input` +1. Browse to [http://localhost:8080](http://localhost:8080) +1. Changes made inside `ynr/` will be immediately available to the app, which + will be auto-reloaded. +1. Remember to shut down the compose stack when you're done: + `podman compose down` + +### Testing your changes + +#### If the app is capable of being started + +1. Start the compose stack: + `podman compose up -d` +1. Run the test suite, stopping on first failure: + `./scripts/container.pytest.bash -x` +1. Stop the compose stack: + `podman compose down` + +#### If the app can't be started, and you need to run the test suite to figure out why + +1. Start the compose stack's database server: + `podman compose up -d dbpsql` +1. Run the entire test suite in a new container (without starting the app): + `./scripts/container.run.bash pytest` +1. Stop the compose stack: + `podman compose down` + +### Running Django management commands + +As detailed [later in this guide](#scripts), there are several different ways +to run a command inside a frontend container. +The method described here uses a +dedicated script to invoke a Django management command inside a frontend +container that's already running the webapp. +(If you need to run a command but don't want to start the webapp, use the more +general `container.run.bash` script instead). 
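As a rough sketch of the choice described above, the helper below prints the command line to use for a management command, depending on whether the compose stack is already up. The function name `ynr_manage_cmd` and its `up`/`down` first argument are hypothetical and are not part of this repo; only the two script paths come from this guide. It prints the command rather than running it, so it can be checked first.

```shell
# Hypothetical helper (not part of the YNR repo): print the command line
# for a Django management command.
#   $1      -- "up" if the compose stack is already running, else "down"
#   $2...   -- the management command and its arguments
ynr_manage_cmd() {
  stack_state="$1"
  shift
  if [ "$stack_state" = "up" ]; then
    # Webapp already running: use the dedicated script.
    printf '%s' "./scripts/container.manage-py.bash"
  else
    # Webapp not running: use a fresh, ephemeral frontend container.
    # (The database container still needs to be started separately.)
    printf '%s' "./scripts/container.run.bash python manage.py"
  fi
  printf ' %s' "$@"
  printf '\n'
}
```

For example, `ynr_manage_cmd up check` prints the `container.manage-py.bash` form, and `ynr_manage_cmd down migrate` prints the `container.run.bash` form.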
+ +Run a Django management command: + +1. Add or update any environment variables in `env/frontend.env` as required. +1. Start the compose stack: + `podman compose up -d` +1. Use the `container.manage-py.bash` script to invoke the command: + ``` + ./scripts/container.manage-py.bash command-to-invoke --command-args command params + ``` +1. Stop the compose stack: + `podman compose down` + +After you stop the compose stack, any files added or changed by the management +command inside the `ynr` directory will be persisted directly on your machine. +The same applies to any files mentioned in +[`docker-compose.yml`](../docker-compose.yml), +in the `frontend` container's "`volumes`" section. +**Any changes the management command makes to files *outside* those locations +will be lost when you stop the compose stack**. +Changes to the database are persisted in the database's data volume. + +## Working with Podman + +YNR uses a container runtime called [Podman](https://podman.io/). You can think +of it as like Docker, but able to run without needing a persistent background +daemon and without requiring root access. +Podman provides the `podman` command which is intended to be CLI-compatible +with much of the `docker` command. + +You've also been asked to install `podman-compose` - a separate project that +gives the `podman` command its `podman compose` subcommand. You shouldn't need +to invoke `podman-compose` (with a hyphen) directly. The `podman compose` +command works with the "compose stack" defined in +[docker-compose.yml](../docker-compose.yml), which comprises two services: the +`frontend` webapp, and the Postgres `dbpsql` service. + +### Working on the webapp + +Use `podman compose up -d` to start the compose stack, with the webapp exposed +on [localhost:8080](http://localhost:8080). Changes you make inside the `ynr/` +directory are automatically reflected in the running app. 
Changes to other +entries in the frontend's [docker-compose.yml](../docker-compose.yml) list of +volumes that are bind-mounted from your local checkout of this repo are also +immediately visible to the running app. + +View the app (and DB) logs with `podman compose logs --follow`. It's a good +idea to run this immediately after starting the stack. + +Shut down the stack with `podman compose down`. This is always safe to run, +even when the stack is already stopped. It deliberately leaves the database's +data behind as a "volume", so that Postgres can access it the next time you +start the stack. If you need to delete the database's contents completely, run +`podman compose down --volumes`. + +`podman compose ...` subcommands do provide the expected `--help` flag, but +some of the docs aren't perfect. Here's a summary of the commands you might +run: + + +| Command                                                   | Purpose | Notes +| :--- | :--- | :--- +| `podman compose up -d` | Start the entire stack. | `-d` forks the action into the background, which is optional but strongly recommended. +| `podman compose up -d dbpsql` | Start only the named container in the stack. | +| `podman compose down` | Stop any running containers in the stack. | +| `podman compose down --volumes` | Stop any running containers in the stack and also destroy their persistent data. | Anything bind-mounted from your local repo into the `frontend` webapp container is left untouched. | +| `podman compose ps` | Display the status of containers in the stack. | +| `podman volume ls` | List the persistent volumes that podman controls on your machine. | +| `podman compose build` | Rebuild the webapp's frontend container image. | +| `podman compose build --no-cache` | Rebuild the webapp's frontend container image from scratch. | Takes several minutes to finish. +| `podman compose logs` | Display the last N stdout/stderr lines emitted by any running containers. 
| +| `podman compose logs --follow` | Display the last N stdout/stderr lines emitted by any running containers, and then wait for more lines. | +| `podman system reset` | Destroy everything that Podman controls. | "Everything seems to have gone wrong, so I'll just start from scratch". It wipes out all containers, networks, images, volumes, etc ... so **avoid this if possible!** | + +### Scripts + +These executable scripts are available from the [`scripts`](../scripts) directory. + + +| Script | Purpose | Parameters +| :--- | :--- | :--- +| `container.image.build.bash` | Builds the YNR container image | $1 -- The named stage from [`container/build/Containerfile`](../container/build/Containerfile) to build and tag (*required*)
$2, $3, ... -- Any parameters to pass to the underlying builder process (*optional*) +| `container.exec.bash` | Runs a command inside the already-running `frontend` container | The unquoted command to run (*required*) +| `container.manage-py.bash` | Runs a Django management command inside the already-running `frontend` container | The unquoted command to run (*required*) +| `container.pytest.bash` | Runs `pytest` inside the already-running `frontend` container | Any parameters for Pytest (*optional*) +| `container.run.bash` | Runs a command inside a freshly-instantiated, ephemeral `frontend` container | The unquoted command to run (*required*) + +### Rebuilding the application container image + +You will need to rebuild the application's container image if you change any of +the application's dependencies, across any of the packaging ecosystems it +currently relies on: + +- `container/build/system-packages`: System / APT dependencies +- `package{,-lock}.json`: Node dependencies +- `requirements/*.txt`: Python dependencies +- `.dockerignore`: Container build-time file dependencies + +The above list is presented in descending order of how slow a rebuild will be, +if a particular package ecosystem's dependencies are changed. +Changing a system dependency, for example, forces a longer rebuild than +changing a Python dependency. +**You do not need to rebuild the application's container image if you only +change files in the `ynr/` directory**. Changes to the YNR application are +picked up automatically when using the compose stack locally (as described +elsewhere in this guide). + +The build process for the YNR application is encoded in +[`container/build/Containerfile`](../container/build/Containerfile). +This Docker-compatible file describes two image stages, `prod` and `test`, with +`test` being built on top of `prod`. +Locally, on your development machine, you will need to use the `test` stage.
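The rebuild rule above can be sketched as a simple path check. The `needs_rebuild` function is a hypothetical illustration, not a script shipped in this repo, and it assumes that `container/build/system-packages` may be either a file or a directory. It covers only the paths listed in this section.

```shell
# Hypothetical helper (not part of the YNR repo): succeed (exit 0) if a
# change to the given repo-relative path requires an image rebuild.
needs_rebuild() {
  case "$1" in
    container/build/system-packages|container/build/system-packages/*) return 0 ;;  # System / APT
    package.json|package-lock.json) return 0 ;;                                     # Node
    requirements/*.txt) return 0 ;;                                                 # Python
    .dockerignore) return 0 ;;                                                      # Build-time files
    *) return 1 ;;                                                                  # e.g. anything under ynr/
  esac
}

# Possible usage, from the root of a checkout:
#   git diff --name-only | while read -r path; do
#     needs_rebuild "$path" && echo "rebuild needed: $path"
#   done
```

Anything the function rejects, such as files under `ynr/`, is picked up by the running compose stack without a rebuild.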
+ +#### Build the `test` stage using a build cache + +``` +./scripts/container.image.build.bash test +``` + +#### Build the `test` stage without a build cache + +Avoiding the use of your local build cache significantly increases the time it +takes to build the container image, but is sometimes useful when there's a +problem with external dependencies (e.g. if an important update has been +published for an APT package but it's not visible in the container's package +index). + +``` +./scripts/container.image.build.bash test --no-cache +``` diff --git a/docs/INSTALL.md b/docs/INSTALL.md index cb909a955..a8a4b534e 100644 --- a/docs/INSTALL.md +++ b/docs/INSTALL.md @@ -1,91 +1,70 @@ # Installation -TODO: improve these docs with more detail - -YourNextRepresentative requires python >=3.5 and PostgreSQL - -## Install python dependencies - -``` -pip install -U pip -pip install -r requirements.txt -``` - -## Set up database - -``` -sudo -u postgres createdb ynr -``` - -If using mac-os/homebrew -``` -createdb ynr -``` - -``` -cp ynr/settings/local.py.example ynr/settings/local.py -``` - -Add database credentials to `DATABASES` dict in `local.py` - -``` -brew install libmagic -./manage.py migrate -``` - -To populate the database run from the live site run: - -``` -python manage.py candidates_import_from_live_site -``` - -(Note that this command will take multiple hours to complete.) - -## Build frontend assets - -``` -npm run build -npm install -``` - -## (Optional) Code linting - -A CI will check all code against Black and Flake8. To save pushing commits that don't -pass these tests you can configure pre-commmit hooks. - -Do this by installing `[precommit](https://pre-commit.com/)`: - -``` -pip install pre-commit -pre-commit install -``` - -## (Optional) SOPN parsing - -SOPNs parsing (see `ynr/apps/sopn_parsing/README.md`) is optional -because it depends on various system packages beyond python packages. 
- -It currently requires [camelot-py](https://camelot-py.readthedocs.io/en/master/user/install.html#install) -and that in turn requires `python-tk` and `ghostscript`. - -Read up on how to install them, and then install the SOPN parsing requirements: - -``` -pip install -r requirements/sopn_parsing.txt -``` - -File conversion relies on `pandoc` to turn non-pdf SOPN files into pdf files. -To install `pandoc`, visit https://pandoc.org/installing.html and follow instructions -for Mac OS and Ubuntu. - -AWS Textract relies on the following packages for viewing image results: - -https://pypi.org/project/pdf2image/ - -To install these packages run: - -``` -brew install poppler -``` - -_If you have omitted SOPN and are having problems getting the project to run, you may need to follow the SOPN steps._ \ No newline at end of file +## Local development + +To develop YNR on your local machine you'll first need to install its +containerisation prerequisites. We use containers in development to isolate the +(non-trivial!) set of *application* prerequisites away from your local machine, +and to get closer to the intended future state of the application's +*production* deployment. + +### Install and test containerisation prerequisites + +1. Clone this repository: + `git clone --branch jcm/wip https://github.com/DemocracyClub/yournextrepresentative` +1. Install the `podman` command: https://podman.io/docs/installation. + These installation mechanisms have been tested: + - System package on Ubuntu 24.04 LTS + - https://podman.io/docs/installation#ubuntu +1. Install the `podman-compose` command: https://pypi.org/project/podman-compose/. 
+ These installation mechanisms have been tested: + - System package on Ubuntu 24.04 LTS + - This version (v1.0.x) emits non-optional verbose debug logs + - https://packages.ubuntu.com/noble/podman-compose + - `apt install podman-compose` + - Manual installation of v1.2.0 APT package on Ubuntu 24.04 LTS + - This version's verbose debug logs are optional + - https://packages.ubuntu.com/oracular/all/podman-compose/download + - `dpkg -i path/to/debian-package.deb` + - Local `pip` installation of v1.2.0 on Ubuntu 24.04 LTS + - This version's verbose debug logs are optional + - https://pypi.org/project/podman-compose/ + - `pip install podman-compose` + - Either inside a venv, or not, as you prefer +1. Configure `podman` to be less chatty, by placing this configuration in `$HOME/.config/containers/containers.conf`: + ```ini + # Don't emit logs on each invocation of the compose command indicating + # that an external compose provider is being executed. + [engine] + compose_warning_logs=false + ``` +1. Make sure the `bash` shell is available: + `which bash || echo Not found` +1. Build any container images used by the compose stack: + `podman compose build` +1. Pull any 3rd-party container images used by the compose stack: + `podman compose pull` +1. Set up your local/development envvars as needed, by placing keys and values in + `env/frontend.env`, like this: + ``` + DJANGO_SETTINGS_MODULE=ynr.settings.testing + ``` +1. Test that the compose stack can be stood up: + ```bash + podman compose up -d # NB Space between "podman" and "compose"! + curl 0:8080 + ``` + Curl **should** report a server error (i.e. a 500) because your database + setup is incomplete. This step tests only that `podman` and `podman-compose` + are able to run successfully on your machine when given YNR's + `docker-compose.yml` file. +1. Test that Django management commands can be invoked: + `./scripts/container.manage-py.bash check` +1. 
Run the test suite (which only requires that a database server be + *available*, not that it contains any specific data). + This will take a little time to finish: + `./scripts/container.pytest.bash` +1. Shut down the compose stack: + `podman compose stop` + +Now you can use the tools and workflows detailed in [DEVELOPMENT.md](DEVELOPMENT.md). diff --git a/docs/INSTALL.old.md b/docs/INSTALL.old.md new file mode 100644 index 000000000..cb909a955 --- /dev/null +++ b/docs/INSTALL.old.md @@ -0,0 +1,91 @@ +# Installation + +TODO: improve these docs with more detail + +YourNextRepresentative requires python >=3.5 and PostgreSQL + +## Install python dependencies + +``` +pip install -U pip +pip install -r requirements.txt +``` + +## Set up database + +``` +sudo -u postgres createdb ynr +``` + +If using mac-os/homebrew +``` +createdb ynr +``` + +``` +cp ynr/settings/local.py.example ynr/settings/local.py +``` + +Add database credentials to `DATABASES` dict in `local.py` + +``` +brew install libmagic +./manage.py migrate +``` + +To populate the database run from the live site run: + +``` +python manage.py candidates_import_from_live_site +``` + +(Note that this command will take multiple hours to complete.) + +## Build frontend assets + +``` +npm run build +npm install +``` + +## (Optional) Code linting + +A CI will check all code against Black and Flake8. To save pushing commits that don't +pass these tests you can configure pre-commmit hooks. + +Do this by installing `[precommit](https://pre-commit.com/)`: + +``` +pip install pre-commit +pre-commit install +``` + +## (Optional) SOPN parsing + +SOPNs parsing (see `ynr/apps/sopn_parsing/README.md`) is optional +because it depends on various system packages beyond python packages. + +It currently requires [camelot-py](https://camelot-py.readthedocs.io/en/master/user/install.html#install) +and that in turn requires `python-tk` and `ghostscript`. 
+ +Read up on how to install them, and then install the SOPN parsing requirements: + +``` +pip install -r requirements/sopn_parsing.txt +``` + +File conversion relies on `pandoc` to turn non-pdf SOPN files into pdf files. +To install `pandoc`, visit https://pandoc.org/installing.html and follow instructions +for Mac OS and Ubuntu. + +AWS Textract relies on the following packages for viewing image results: + +https://pypi.org/project/pdf2image/ + +To install these packages run: + +``` +brew install poppler +``` + +_If you have omitted SOPN and are having problems getting the project to run, you may need to follow the SOPN steps._ \ No newline at end of file diff --git a/ynr/apps/sopn_parsing/README.md b/ynr/apps/sopn_parsing/README.md index f8cb89e91..875ba63fa 100644 --- a/ynr/apps/sopn_parsing/README.md +++ b/ynr/apps/sopn_parsing/README.md @@ -2,6 +2,9 @@ This app is designed to extract useful information out of UK Statement Of Persons Nominated documents (SOPNs), published before elections. +It uses the `pypandoc_binary` dependency (which both provides an interface to, +and installs the binary of, the upstream [pandoc](https://pandoc.org/) project) +to convert SOPN documents to PDF (as needed) to be parsed. The documents contain information on candidates for a given election, but are published in a wide variety of layouts. diff --git a/ynr/settings/local.py.container.example b/ynr/settings/local.py.container.example index 48179889f..3b9325fd2 100644 --- a/ynr/settings/local.py.container.example +++ b/ynr/settings/local.py.container.example @@ -1,6 +1,10 @@ # Only set DEBUG to True in development environments. DEBUG = True +# These Postgres settings should match docker-compose.yml. +DATABASES["default"]["NAME"] = "ynr" # noqa +DATABASES["default"]["USER"] = "ynr" # noqa + # This short, known value is insecure. SECRET_KEY = "development"