Set orthanc raw maximum storage in Dockerfile (#179)
* Set orthanc raw maximum storage in Dockerfile

* Orthanc raw defaults to no limit on storage

Saves us from faffing around with build args
for other tests

* Format README

* Document `ORTHANC_RAW_MAXIMUM_STORAGE_SIZE`

* Add simple test for `ORTHANC_RAW_MAXIMUM_STORAGE_SIZE`

---------

Co-authored-by: Milan Malfait <[email protected]>
stefpiatek and milanmlft authored Dec 13, 2023
1 parent 55972d5 commit 9363697
Showing 7 changed files with 75 additions and 15 deletions.
2 changes: 1 addition & 1 deletion .env.sample
@@ -39,7 +39,7 @@ ORTHANC_RAW_USERNAME=
ORTHANC_RAW_PASSWORD=
ORTHANC_RAW_AE_TITLE=
ORTHANC_AUTOROUTE_RAW_TO_ANON=true
-ORTHANC_RAW_MAXIMUM_STORAGE_SIZE=
+ORTHANC_RAW_MAXIMUM_STORAGE_SIZE= // MB

# PIXL Orthanc anon instance
ORTHANC_ANON_USERNAME=
55 changes: 44 additions & 11 deletions README.md
@@ -1,4 +1,5 @@
# PIXL

PIXL Image eXtraction Laboratory

`PIXL` is a system for extracting, linking and de-identifying DICOM imaging data, structured EHR data and free-text data from radiology reports at UCLH.
@@ -8,92 +9,124 @@ PIXL is intended run on one of the [GAE](https://github.com/UCLH-Foundry/Book-of
several services orchestrated by [Docker Compose](https://docs.docker.com/compose/).

## Services

### [PIXL CLI](./cli/README.md)

Primary interface to the PIXL system.

### [Hasher API](./hasher/README.md)

HTTP API to securely hash an identifier using a key stored in Azure Key Vault.

### [Orthanc Raw](./orthanc/orthanc-raw/README.md)

A DICOM node which receives images from the upstream hospital systems and acts as cache for PIXL.

### [Orthanc Anon](./orthanc/orthanc-anon/README.md)

A DICOM node which wraps our de-identification and cloud transfer components.

### PostgreSQL

RDBMS which stores DICOM metadata, application data and anonymised patient record data.

### [Electronic Health Record Extractor](./pixl_ehr/README.md)

HTTP API to process messages from the `ehr` queue and populate raw and anon tables in the PIXL postgres instance.

### [PACS Image Extractor](./pixl_pacs/README.md)

HTTP API to process messages from the `pacs` queue and populate the raw orthanc instance with images from PACS/VNA.

## Setup

### 0. Choose deployment environment

This is one of dev|test|staging|prod and referred to as `<environment>` in the docs.

### 1. Initialise environment configuration

Create a local `.env` and `pixl_config.yml` file in the _PIXL_ directory:

```bash
cp .env.sample .env && cp pixl_config.yml.sample pixl_config.yml
```

Add the missing configuration values to the new files:

#### Environment

Set `ENV` to `<environment>`.

#### Credentials

- `EMAP_DB_`*
UDS credentials are only required for `prod` or `staging` deployments or when working on the EHR & report retriever component.
You can leave them blank for other dev work.
- `PIXL_DB_`*
These are credentials for the containerised PostgreSQL service and are set in the official PostgreSQL image.
Use a strong password for `prod` deployment but the only requirement for other environments is consistency as several services interact with the database.
- `PIXL_EHR_API_AZ_`*
These credentials are used for uploading a PIXL database to Azure blob storage. They should be for a service principal that has `Storage Blob Data Contributor`
on the target storage account. The storage account must also allow network access from the PIXL host machine.

#### Ports

Most services need to expose ports that must be mapped to ports on the host. The host port is specified in `.env`.
Ports need to be configured such that they don't clash with any other application running on that GAE.

#### Storage size

The maximum storage size of the `orthanc-raw` instance can be configured through the `ORTHANC_RAW_MAXIMUM_STORAGE_SIZE`
environment variable in `.env`. This limits the storage size to the specified value (in MB). When the storage is full
[Orthanc will automatically recycle older studies in favour of new ones](https://orthanc.uclouvain.be/book/faq/features.html#id8).

## Run

### Start

From the _PIXL_ directory:

```bash
bin/pixldc pixl_dev up
```

### Stop

From the _PIXL_ directory:

```bash
bin/pixldc pixl_dev down
```

## Analysis

The number of DICOM instances in the raw Orthanc instance can be accessed from
`http://<pixl_host>:<ORTHANC_RAW_WEB_PORT>/ui/app/#/settings` and similarly with
the Orthanc Anon instance, where `pixl_host` is the host of the PIXL services
and `ORTHANC_RAW_WEB_PORT` is defined in `.env`.

The number of reports and EHR can be interrogated by connecting to the PIXL
database with a database client (e.g. [DBeaver](https://dbeaver.io/)), using
the connection parameters defined in `.env`. For example, to find the number of
non-null reports:

```sql
select count(*) from emap_data.ehr_anon where xray_report is not null;
```


## Develop

See each service's README for individual development and testing instructions.
For Python development we use [isort](https://github.com/PyCQA/isort) and [black](https://black.readthedocs.io/en/stable/index.html) alongside [pytest](https://www.pytest.org/).
There is support (sometimes through plugins) for these tools in most IDEs & editors.
Before raising a PR, **run the full test suite** from the _PIXL_ directory with

```bash
bin/run-all-tests.sh
```

and not just the component you have been working on as this will help us catch unintentional regressions without spending GH actions minutes :-)

We run [pre-commit](https://pre-commit.com/) as part of the GitHub Actions CI. To install and run it locally, do:

2 changes: 1 addition & 1 deletion docker-compose.yml
@@ -157,6 +157,7 @@ services:
       dockerfile: ./docker/orthanc-raw/Dockerfile
       args:
         <<: *build-args-common
+        ORTHANC_RAW_MAXIMUM_STORAGE_SIZE: ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}
     command: /run/secrets
     environment:
       <<: [*pixl-db, *proxy-common, *pixl-common-env]
@@ -182,7 +183,6 @@ services:
       - type: volume
         source: orthanc-raw-data
         target: /var/lib/orthanc/db
-      - ${PWD}/orthanc/orthanc-raw/config:/run/secrets:ro
     networks:
       - pixl-net
     depends_on:
7 changes: 6 additions & 1 deletion docker/orthanc-raw/Dockerfile
@@ -14,4 +14,9 @@
FROM osimis/orthanc:22.9.0-full-stable
SHELL ["/bin/bash", "-o", "pipefail", "-e", "-u", "-x", "-c"]

-COPY ./orthanc/orthanc-raw/plugin/pixl.py /etc/orthanc/pixl.py
+ARG ORTHANC_RAW_MAXIMUM_STORAGE_SIZE
+
+COPY ./orthanc/orthanc-raw/plugin/pixl.py /etc/orthanc/pixl.py
+# Orthanc can't substitute environment variables as integers so copy and replace before running
+COPY ./orthanc/orthanc-raw/config /run/secrets
+RUN sed -i "s/\${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}/${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE:-0}/g" /run/secrets/orthanc.json
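
The substitution performed by the `RUN sed` step can be tried outside Docker. The sketch below is only illustrative: the `render_max_storage` function and the sample template line are assumptions made for this example (they are not part of the commit), and it filters standard input rather than editing `/run/secrets/orthanc.json` in place.

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): apply the same sed expression as
# the Dockerfile to text read from stdin, falling back to 0 when
# ORTHANC_RAW_MAXIMUM_STORAGE_SIZE is unset or empty.
render_max_storage() {
    sed "s/\${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}/${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE:-0}/g"
}

# Sample line modelled on the placeholder in orthanc.json.
template='"MaximumStorageSize" : ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}, // MB'

# Case 1: variable unset, so the default of 0 is substituted.
unset ORTHANC_RAW_MAXIMUM_STORAGE_SIZE
no_limit="$(printf '%s\n' "$template" | render_max_storage)"

# Case 2: variable set to an example value of 500.
ORTHANC_RAW_MAXIMUM_STORAGE_SIZE=500
limited="$(printf '%s\n' "$template" | render_max_storage)"

echo "$no_limit"   # "MaximumStorageSize" : 0, // MB
echo "$limited"    # "MaximumStorageSize" : 500, // MB
```

An unset or empty variable therefore renders as `0`, which matches the commit message's statement that Orthanc raw defaults to no limit on storage.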
2 changes: 1 addition & 1 deletion orthanc/orthanc-raw/config/orthanc.json
@@ -12,7 +12,7 @@

// Limit the maximum storage size
"MaximumPatientCount" : 0, // no limit
-"MaximumStorageSize" : ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}, // MB
+"MaximumStorageSize" : ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}, // MB, replaced in Dockerfile because it's an integer
"MaximumStorageMode" : "Recycle",


1 change: 1 addition & 0 deletions test/run-system-test.sh
@@ -31,6 +31,7 @@ pixl start
sleep 65 # need to wait until the DICOM image is "stable" = 60s
./scripts/check_entry_in_pixl_anon.sh
./scripts/check_entry_in_orthanc_anon.sh
+./scripts/check_max_storage_in_orthanc_raw.sh

cd "${PACKAGE_DIR}"
docker compose -f docker-compose.yml -f ../docker-compose.yml -p test down
21 changes: 21 additions & 0 deletions test/scripts/check_max_storage_in_orthanc_raw.sh
@@ -0,0 +1,21 @@
#!/bin/bash
# Copyright (c) University College London Hospitals NHS Foundation Trust
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
set -euxo pipefail

# This could be much improved by having more realistic test data some of
# which actually was persisted
source ./.env.test
docker logs test-orthanc-raw-1 2>&1 | grep "At most ${ORTHANC_RAW_MAXIMUM_STORAGE_SIZE}MB will be used for the storage area"
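
The check relies on `grep` exiting non-zero when the expected log line is absent, which makes the script fail under `set -e`. A minimal sketch of that behaviour follows, using a made-up log line in place of real `docker logs` output. The `log_reports_limit` function name and the sample text are assumptions for illustration only, and the sketch uses `grep -qF` (quiet, fixed-string matching) where the script uses plain `grep`.

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): read log text from stdin and
# succeed only if it reports the expected storage limit in MB.
log_reports_limit() {
    grep -qF "At most ${1}MB will be used for the storage area"
}

# Made-up sample standing in for the output of `docker logs`.
sample_log='At most 1MB will be used for the storage area'

# Record whether the expected limit of 1 MB is reported in the sample.
if printf '%s\n' "$sample_log" | log_reports_limit 1; then
    match_status="found"
else
    match_status="missing"
fi
echo "$match_status"
```

In the real test the expected value comes from `ORTHANC_RAW_MAXIMUM_STORAGE_SIZE` in `.env.test`, so a mismatch between the configured and logged limit stops the system test.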
