Skip to content

Commit

Permalink
Merge pull request rice-crc#272 from JohnMulligan/deployment_notes
Browse files Browse the repository at this point in the history
Deployment notes
  • Loading branch information
derekjkeller authored Feb 3, 2025
2 parents 2c6ccad + 3476703 commit 0ab9215
Show file tree
Hide file tree
Showing 12 changed files with 271 additions and 125 deletions.
11 changes: 8 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

The following notes provide an overview of how to install and run the SV API project, which is a restructuring of SlaveVoyages.org to bring it closer to a true microservices model.

For notes on how to use the API, see the [Project Structure readme file](PROJECT_STRUCTURE.md)
For notes on the project structure, see the [Project Structure readme file](PROJECT_STRUCTURE.md)

For a Swagger UI presentation of the API documentation's generic public endpoints, go to the root url of the endpoint:

Expand Down Expand Up @@ -87,7 +87,7 @@ _Note: you can remove the `-d` option to run the process in the foreground. JCM
Allow a short bit of time for the mysql container to initialize. Then inject the sql dump.

```bash
local:~/Projects/voyages-api$ docker exec -i voyages-mysql mysql -uroot -pvoyages voyages_api < data/voyages_prod.sql
local:~/Projects/voyages-api$ docker exec -i voyages-mysql mysql -uroot -pvoyages voyages_api < data/voyages_prod.sql
```

Verify the data import.
Expand All @@ -112,7 +112,6 @@ local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c au
local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c autocomplete_enslaved_names -d /srv/voyages/solr
local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c autocomplete_enslaver_aliases -d /srv/voyages/solr
local:~/Projects/voyages-api$ docker exec -i voyages-api bash -c 'python3 manage.py rebuild_indices'
local:~/Projects/voyages-api$ docker exec -i voyages-api bash -c 'python3 manage.py rebuild_autocomplete_indices'
```

Build the API component containers.
Expand All @@ -121,6 +120,12 @@ Build the API component containers.
local:~/Projects/voyages-api$ docker compose up --build -d voyages-geo-networks voyages-people-networks voyages-stats
```

Rebuild the map routes (this can take some time).

```bash
local:~/Projects/voyages-api$ docker exec -i voyages-geo-networks bash -c 'flask pickle rebuild'
```

## Generating an API Key for the Flask Components

The Flask components of the app require an API key.
Expand Down
180 changes: 180 additions & 0 deletions docs/DEPLOYMENT_STEPS_AND_REQS.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,180 @@
# Voyages3 API Deployment Notes

The Voyages3 API project is based on a microservices model. A Django application models the data, manages the database, and processes requests to various endpoints.

In some cases, Django itself retrieves, formats, and returns the requested data. This is usually when the client is requesting full record(s), such as all the data on one more more voyages.

In many cases, however, the data being requested requires aggregation or indexing, based on both legacy and more recent custom functionality. Processing these requests live in Django would either be highly inefficient or require customized algorithms in Django which tend to be memory-inefficient or fragile.

Therefore, for requests such as maps and graphs, Django acts as an API gateway. It receives the client requests, farms these out to dedicated containers (typically written in Flask) that have indexed the API's data, and formats the containers' responses before returning the data to the client.

In order to achieve this, the specialized containers have to index Voyages data when they are instantiated.

## Container Requirements

These specifications exclude pre-built containers such as:

* MySQL
* Redis
* Mailhog
* Adminer

### geo-networks

Requirements:

* a localsettings file with
* a ```TMP_PATH``` where large pickle files can be stored
* a valid key to voyages-api
* the base URL for the voyages-api endpoints
* a valid mount point at ```TMP_PATH``` for those pickle files to be stored to
* rebuilder_number_of_workers

### stats and people-networks

Requirement: a localsettings file with:

* a valid key to voyages-api
* the base URL for the voyages-api endpoints

### people-networks

Requirement: a localsettings file with:

* a valid key to voyages-api
* the base URL for the voyages-api endpoints

### solr

No special requirements beyond basic solr files which are found in the repo.

### redis

No requirements. Managed by the docker-compose file.

### voyages-api

A localsettings file as specified in ```src/voyages-api/localsettings.py-example```. Key variables in this file that should be changed for deployment are:

* Django's ```SECRET_KEY```
* The settings in ```DATABASES```

#### VOYAGES-API static files

This container also requires, as do most Django projects, a ```static``` directory, mostly for Django plugins' stylesheets and js files.

However, it also requires two folders for custom static objects. Currently, these are located under the ```static``` directory

##### A. Blog images

However, it also requires a folder for static files that cannot be contained in the codebase (though precisely this was historically done in the voyages repo, god help us). Specifically, the blog platform allows users to upload images for use in posts. Currently, this path is set to ```static/uploads/```.

This current configuration (of uploads going to ```static/uploads/```) relies on two settings, currently established in settings.py:

First, the uploads folder's location is set as follows

# Default FileBrowser site
site = FileBrowserSite(name='filebrowser')
site.storage.location = "static"
site.directory="uploads/"
site.storage.base_url = "/static/uploads"

Second, of course, the above relies on the static folder being set as: ```STATIC_ROOT='static'```. You'll therefore see below, in the deployment steps, a requirement to create two static folders.

##### B. IIIF Manifests

The document viewer relies on third-party IIIF image servers -- but those servers' image manifests are not up to snuff. We therefore have to generate manifests for our documents that leverage those image servers but use our own metadata.

These manifests currently live in ```static/iiif_manifests```

## Instructions

To deploy the project in a production setting, the following steps should be followed, in order.

### 1. Database dump

Download the latest database dump from the Google Drive project share and
expand into the `data/` directory. Rename the expanded file to `data/voyages_prod.sql`.

Create the MySQL database. Then inject the database dump: ```sudo mysql voyages_api < data/voyages_prod.sql```

### 2. API containers

Build the core service containers (API and Solr).

```bash
local:~/Projects/voyages-api$ docker compose up --build -d voyages-api voyages-adminer voyages-solr voyages-redis
```

Run the app setup and configuration tasks in the below order.

#### 2a. Standard Django management commands:

The below commands should be run on the Django voyages-api container to build out static assets (which Swagger and other plugins use) and to bring the database into sync with any model changes in the codebase.

```bash
local:~/Projects/voyages-api$ docker exec -i voyages-api bash -c 'python3 manage.py migrate'
local:~/Projects/voyages-api$ docker exec -i voyages-api bash -c 'python3 manage.py collectstatic --noinput'
```

#### 2b. Django static assets:

This project has several custom Django management commands. Only one of these commands must be run to start the project from scratch, and they relate to the deployment of static assets (see the section on ```VOYAGES-API static files``` above).

Currently, those static assets live in folders under ```api/static/```. First, we'll want to make sure those subdirectories exist (I haven't included them in the repo because I don't know if Derek will want to move them out of ```api/static```):

```bash
local:~/Projects/voyages-api$ mkdir -p src/api/static/iiif_manifests/'
local:~/Projects/voyages-api$ mkdir -p src/api/static/uploads/'
```

Then we'll want to regenerate the IIIF manifests (or these can be shared via Google Drive, as they are lightweight and don't change often):

```bash
local:~/Projects/voyages-api$ docker exec -i voyages-api bash -c 'python3 manage.py iiif_generate_manifests'
```

And finally, we'll want to populate the uploads folder at ```api/static/uploads``` as these assets are used by the blog. *Those* assets should be shared via Google Drive.

#### 2c. Solr commands:

If you are building from scratch, you will have to create the necessary solr cores:

```
local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c voyages -d /srv/voyages/solr
local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c enslavers -d /srv/voyages/solr
local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c enslaved -d /srv/voyages/solr
local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c blog -d /srv/voyages/solr
local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c autocomplete_source_titles -d /srv/voyages/solr
local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c autocomplete_ship_names -d /srv/voyages/solr
local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c autocomplete_enslaved_names -d /srv/voyages/solr
local:~/Projects/voyages-api$ docker exec -i voyages-solr solr create_core -c autocomplete_enslaver_aliases -d /srv/voyages/solr
```

Then you will want to use those solr cores to build out indexes using the below custom Django management command (this will take approx. 1 minute):

```
local:~/Projects/voyages-api$ docker exec -i voyages-api bash -c 'python3 manage.py rebuild_indices'
```

#### 2d. Flask Containers

Build the API component containers.

```bash
local:~/Projects/voyages-api$ docker compose up --build -d voyages-geo-networks voyages-people-networks voyages-stats
```

##### 2d.1 Note on voyages-geo-networks

This container relies on highly resource-intensive calculations. In order to be responsive in production, its results must therefore be indexed ahead of time. These indexes are saved as pickles in the ```voyages-geo-network``` container's ```TMP_PATH``` directory, which is ignored by git.

When the container is started, it looks in ```TMP_PATH``` to find these pickle files. It loads those it finds. If any are missing, it generates them and then saves them to ```TMP_PATH``` so they can be used at the next startup.

Obviously, if the database is updated, such as with the addition of new voyages, then this index needs to be updated. The command can therefore be invoked manually, as below:

```bash
local:~/Projects/voyages-api$ docker exec -i voyages-geo-networks bash -c 'flask pickle rebuild'
```

Generating all of the indices can take up to 30 minutes. However, if more cores are added to the container, this indexing process can be multi-threaded by setting the ```rebuilder_number_of_workers``` variable to the number of cores. And yet, outside of this indexing workflow, more than 1 core is overkill for this container, so it would be difficult to realize the value of this innovation without some interesting cloud architecture.
4 changes: 4 additions & 0 deletions docs/PROJECT_STRUCTURE.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,10 @@ In order to directly inspect the database (when this is necessary), we have inst

This service is used to index entities in the database (Voyages, Enslaved People, Enslavers, and Blog Posts). Currently, it is only being used for the global search feature. However, it would make sense to use this to index documents at well (though full-text search always requires some tuning).

### Voyages-Redis

A Redis container acts as a cache using k-v pairs. The keys are hashes generated by the Voyages-API django app to store json requests, and the values are the validated responses.

## Python Containers

These containers run python services in Flask or Django. The "main" API container runs a barebones Django web app, which the other processes essentially index the data from, in order to provide efficient use of these data (for instance, when aggregating to make data visualizations).
Expand Down
64 changes: 0 additions & 64 deletions docs/dev_notes.md

This file was deleted.

Original file line number Diff line number Diff line change
Expand Up @@ -64,6 +64,8 @@ def handle(self, *args, **options):

manifest_sources=Source.objects.all().filter(~Q(page_connections__page=None))

manifest_sources=manifest_sources.filter(short_ref__name__icontains="docp")

print(manifest_sources)

#screen out sources that lack either pages
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -381,6 +381,7 @@ def handle(self, *args, **options):
source.zotero_url = zotero_url

source.bib=bib
print(bib)

#get any existing attached date object
#and overwrite it or delete it as necessary
Expand Down
11 changes: 11 additions & 0 deletions src/api/document/views.py
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,8 @@ def post(self,request):
cached_response = redis_cache.get(hashed)
else:
cached_response=None



def solrfilter(queryset,core_name,search_string):

Expand Down Expand Up @@ -99,19 +101,28 @@ def solrfilter(queryset,core_name,search_string):

source_title=srd.get('title')


if source_title is not None:
queryset=queryset.filter(title__icontains=source_title)
print("TITLE",source_title,queryset.count())


bib=srd.get('bib')


if bib is not None:
queryset=queryset.filter(bib__icontains=bib)

print("BIB",bib,queryset.count())

enslavers=srd.get('enslavers')


if enslavers is not None:
queryset=solrfilter(queryset,'enslavers',enslavers)

print("ENSLAVERS",enslavers,queryset.count())

if queryset.count()>0:

ids=[i[0] for i in queryset.values_list('id')]
Expand Down
Loading

0 comments on commit 0ab9215

Please sign in to comment.