Merge pull request #67 from DHI-GRAS/categorical
Implement crude categorical data handling
dionhaefner authored Sep 17, 2018
2 parents 698a085 + 681e6d8 commit 4e63e91
Showing 21 changed files with 489 additions and 568 deletions.
81 changes: 80 additions & 1 deletion README.md
@@ -195,11 +195,90 @@ For all available settings, their types and default values, have a look at the f
[config.py](https://github.com/DHI-GRAS/terracotta/blob/master/terracotta/config.py) in the
Terracotta code.

## Advanced recipes

### Serving categorical data

Categorical datasets are special in that the numerical pixel values carry no direct meaning,
but rather encode which category or label the pixel belongs to. Because labels must be preserved,
serving categorical data comes with its own set of complications:

- Dynamic stretching does not make sense
- Nearest neighbor resampling must be used
- Labels must be mapped to colors consistently

So far, Terracotta is agnostic of categories and labels, but the API is flexible enough to give
you the tools to build your own system. Categorical data can be served by following these steps:

#### During ingestion

1. Create an additional key to encode whether a dataset is categorical or not. E.g., if you are
currently using the keys `sensor`, `date`, and `band`, ingest your data with the keys
`[type, sensor, date, band]`, where `type` can take one of the values `categorical`, `index`,
`reflectance`, or whatever makes sense for your given application.
2. Attach a mapping `category name -> pixel value` to the metadata of your categorical dataset.
With the Python API, this could be done as follows:

```python
import terracotta as tc

driver = tc.get_driver('terracotta.sqlite')

# assuming keys are [type, sensor, date, band]
keys = ['categorical', 'S2', '20181010', 'cloudmask']
raster_path = 'cloud_mask.tif'

category_map = {
    'clear land': 0,
    'clear water': 1,
    'cloud': 2,
    'cloud shadow': 3
}

with driver.connect():
    metadata = driver.compute_metadata(raster_path, extra_metadata={'categories': category_map})
    driver.insert(keys, raster_path, metadata=metadata)
```

#### In the frontend

Ingesting categorical data this way makes it accessible from the frontend. Assuming your
Terracotta server runs at `example.com`, you can use the following functionality:

- To get a list of all categorical data, simply send a GET request to
`example.com/datasets?type=categorical`.
- To get the available categories of a dataset, query
`example.com/metadata/categorical/S2/20181010/cloudmask`. The returned JSON object will contain
a section like this:

```json
{
    "extra_metadata": {
        "categories": {
            "clear land": 0,
            "clear water": 1,
            "cloud": 2,
            "cloud shadow": 3
        }
    }
}
```
- To get correctly labelled imagery, the frontend has to pass an explicit mapping of pixel
values to colors via the `explicit_color_map` argument of `/singleband`. For the dataset above,
the request could look like this:
`example.com/singleband/categorical/S2/20181010/cloudmask/{z}/{x}/{y}.png?colormap=explicit&explicit_color_map={"0": "99d594", "1": "2b83ba", "2": "ffffff", "3": "404040"}`.
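
As a minimal sketch, a frontend or script might assemble such a URL as shown below. The helper name `build_singleband_url` and the `https://example.com` base URL are illustrative assumptions and not part of Terracotta; the JSON object is percent-encoded here so that it survives as a query parameter:

```python
import json
from urllib.parse import quote, unquote


def build_singleband_url(base_url, keys, color_map):
    """Return a /singleband tile URL template with an explicit color map.

    Hypothetical helper for illustration; not part of Terracotta.
    """
    # JSON object keys are always strings, so convert pixel values explicitly
    encoded_map = json.dumps({str(value): color for value, color in color_map.items()})
    key_path = '/'.join(keys)
    return (
        f'{base_url}/singleband/{key_path}/{{z}}/{{x}}/{{y}}.png'
        f'?colormap=explicit&explicit_color_map={quote(encoded_map)}'
    )


# category name -> pixel value, as stored in the dataset metadata
categories = {'clear land': 0, 'clear water': 1, 'cloud': 2, 'cloud shadow': 3}
# category name -> hex color, chosen by the frontend
colors = {'clear land': '99d594', 'clear water': '2b83ba', 'cloud': 'ffffff',
          'cloud shadow': '404040'}

color_map = {categories[name]: colors[name] for name in categories}
url = build_singleband_url(
    'https://example.com', ['categorical', 'S2', '20181010', 'cloudmask'], color_map
)

# decoding the query value again yields the mapping that was encoded
decoded = json.loads(unquote(url.split('explicit_color_map=')[1]))
print(url)
```

Deriving the color map from the `categories` metadata, rather than hard-coding pixel values, keeps label colors consistent across datasets that share category names.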

Supplying an explicit color map in this fashion suppresses stretching, and forces Terracotta to only use
nearest neighbor resampling when reading the data.
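
To illustrate why nearest neighbor resampling matters for labels, the following sketch upsamples a single row of pixel values in two ways. It is a generic, simplified illustration in plain Python and does not reflect how Terracotta reads raster data; both function names are made up for this example:

```python
def resample_nearest(row, new_size):
    """Upsample a 1D row of labels by picking the nearest source pixel."""
    scale = len(row) / new_size
    return [row[min(int((i + 0.5) * scale), len(row) - 1)] for i in range(new_size)]


def resample_linear(row, new_size):
    """Upsample a 1D row by linear interpolation between neighboring pixels."""
    scale = len(row) / new_size
    out = []
    for i in range(new_size):
        # position of the output pixel center in source pixel coordinates
        pos = min(max((i + 0.5) * scale - 0.5, 0), len(row) - 1)
        lo = int(pos)
        hi = min(lo + 1, len(row) - 1)
        frac = pos - lo
        out.append(row[lo] * (1 - frac) + row[hi] * frac)
    return out


labels = [0, 0, 3, 3]  # 0 = clear land, 3 = cloud shadow

nearest = resample_nearest(labels, 8)
linear = resample_linear(labels, 8)

print(nearest)  # [0, 0, 0, 0, 3, 3, 3, 3]
# linear[3] and linear[4] are 0.75 and 2.25: neither is a valid label
print(linear[3], linear[4])
```

Rounding the interpolated values would turn them into 1 and 2, so the boundary between land and cloud shadow would be rendered as water and cloud, two categories that are not present in the source row.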

Colors can be passed as hex strings (as in this example) or as RGB color tuples. If you are looking
for a color scheme for your categorical datasets, [ColorBrewer](http://colorbrewer2.org) offers
some excellent suggestions.
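
The two color notations carry the same information. A sketch of the conversion, assuming six-digit hex strings with an optional leading `#`; `hex_to_rgb` is a hypothetical helper written for this example, not Terracotta's own parsing code:

```python
def hex_to_rgb(color):
    """Convert a hex color string such as '#99d594' to an [R, G, B] list."""
    hex_string = color.lstrip('#')
    if len(hex_string) != 6:
        raise ValueError(f'expected six hex digits, got {color!r}')
    # int(..., 16) raises ValueError for non-hex characters
    return [int(hex_string[i:i + 2], 16) for i in (0, 2, 4)]


hex_map = {'0': '99d594', '1': '2b83ba', '2': 'ffffff', '3': '404040'}
rgb_map = {value: hex_to_rgb(color) for value, color in hex_map.items()}
print(rgb_map['0'])  # [153, 213, 148]
```

Either `hex_map` or `rgb_map` could then be serialized to JSON and passed as the `explicit_color_map` value.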

## Deployment on AWS λ

The easiest way to deploy Terracotta on AWS λ is by using [Zappa](https://github.com/Miserlou/Zappa).


Example `zappa_settings.json` file:

```json
```
43 changes: 23 additions & 20 deletions terracotta/api/legend.py → terracotta/api/colormap.py
@@ -1,6 +1,6 @@
 """api/keys.py
-Flask route to handle /legend calls.
+Flask route to handle /colormap calls.
 """

 from typing import Any, Mapping, Dict
@@ -13,16 +13,16 @@
 from terracotta.cmaps import AVAILABLE_CMAPS


-class LegendEntrySchema(Schema):
+class ColormapEntrySchema(Schema):
     value = fields.Number(required=True)
     rgb = fields.List(fields.Number(), required=True, validate=validate.Length(equal=3))


-class LegendSchema(Schema):
-    legend = fields.Nested(LegendEntrySchema, many=True, required=True)
+class ColormapSchema(Schema):
+    colormap = fields.Nested(ColormapEntrySchema, many=True, required=True)


-class LegendOptionSchema(Schema):
+class ColormapOptionSchema(Schema):
     class Meta:
         unknown = EXCLUDE

@@ -31,8 +31,11 @@ class Meta:
         description='Minimum and maximum value of colormap as JSON array '
                     '(same as for /singleband and /rgb)'
     )
-    colormap = fields.String(description='Name of color map to use (see /colormap)',
-                             missing=None, validate=validate.OneOf(AVAILABLE_CMAPS))
+    colormap = fields.String(
+        description='Name of color map to use (for a preview see '
+                    'https://matplotlib.org/examples/color/colormaps_reference.html)',
+        missing=None, validate=validate.OneOf(AVAILABLE_CMAPS)
+    )
     num_values = fields.Int(description='Number of values to return', missing=255)

     @pre_load
@@ -48,36 +51,36 @@ def process_ranges(self, data: Mapping[str, Any]) -> Dict[str, Any]:
         return data


-@metadata_api.route('/legend', methods=['GET'])
+@metadata_api.route('/colormap', methods=['GET'])
 @convert_exceptions
-def get_legend() -> str:
-    """Get a legend mapping pixel values to colors
+def get_colormap() -> str:
+    """Get a colormap mapping pixel values to colors
     ---
     get:
-        summary: /legend
+        summary: /colormap
         description:
-            Get a legend mapping pixel values to colors. Use this to construct a color bar for a
+            Get a colormap mapping pixel values to colors. Use this to construct a color bar for a
             dataset.
         parameters:
           - in: query
-            schema: LegendOptionSchema
+            schema: ColormapOptionSchema
         responses:
             200:
                 description: Array containing data values and RGBA tuples
-                schema: LegendSchema
+                schema: ColormapSchema
             400:
                 description: Query parameters are invalid
     """
-    from terracotta.handlers.legend import legend
+    from terracotta.handlers.colormap import colormap

-    input_schema = LegendOptionSchema()
+    input_schema = ColormapOptionSchema()
     options = input_schema.load(request.args)

-    payload = {'legend': legend(**options)}
+    payload = {'colormap': colormap(**options)}

-    schema = LegendSchema()
+    schema = ColormapSchema()
     return jsonify(schema.load(payload))


-spec.definition('LegendEntry', schema=LegendEntrySchema)
-spec.definition('Legend', schema=LegendSchema)
+spec.definition('ColormapEntry', schema=ColormapEntrySchema)
+spec.definition('Colormap', schema=ColormapSchema)
37 changes: 0 additions & 37 deletions terracotta/api/colormaps.py

This file was deleted.

6 changes: 2 additions & 4 deletions terracotta/api/flask_api.py
@@ -82,10 +82,9 @@ def create_app(debug: bool = False,
     new_app.debug = debug

     # import submodules to populate blueprints
-    import terracotta.api.colormaps
     import terracotta.api.datasets
     import terracotta.api.keys
-    import terracotta.api.legend
+    import terracotta.api.colormap
     import terracotta.api.metadata
     import terracotta.api.rgb
     import terracotta.api.singleband
@@ -95,10 +94,9 @@ def create_app(debug: bool = False,

     # register routes on API spec
     with new_app.test_request_context():
-        spec.add_path(view=terracotta.api.colormaps.get_colormaps)
         spec.add_path(view=terracotta.api.datasets.get_datasets)
         spec.add_path(view=terracotta.api.keys.get_keys)
-        spec.add_path(view=terracotta.api.legend.get_legend)
+        spec.add_path(view=terracotta.api.colormap.get_colormap)
         spec.add_path(view=terracotta.api.metadata.get_metadata)
         spec.add_path(view=terracotta.api.rgb.get_rgb)
         spec.add_path(view=terracotta.api.singleband.get_singleband)
67 changes: 53 additions & 14 deletions terracotta/api/singleband.py
@@ -6,7 +6,8 @@
 from typing import Any, Mapping, Dict
 import json

-from marshmallow import Schema, fields, validate, pre_load, ValidationError, EXCLUDE
+from marshmallow import (Schema, fields, validate, validates_schema,
+                         pre_load, ValidationError, EXCLUDE)
 from flask import request, send_file

 from terracotta.api.flask_api import convert_exceptions, tile_api
@@ -29,19 +30,56 @@ class Meta:
         description='Stretch range to use as JSON array, uses full range by default. '
                     'Null values indicate global minimum / maximum.', missing=None
     )
-    colormap = fields.String(description='Colormap to apply to image (see /colormap)',
-                             missing=None, validate=validate.OneOf(AVAILABLE_CMAPS))
+
+    colormap = fields.String(
+        description='Colormap to apply to image (see /colormap)',
+        validate=validate.OneOf(('explicit', *AVAILABLE_CMAPS)), missing=None
+    )
+
+    explicit_color_map = fields.Dict(
+        keys=fields.Number(),
+        values=fields.List(fields.Number, validate=validate.Length(equal=3)),
+        example='{{0: (255, 255, 255)}}',
+        description='Explicit value-color mapping to use as JSON object. '
+                    'Must be given together with colormap=explicit. Color values can be '
+                    'specified either as RGB tuple (in the range of [0, 255]), or as '
+                    'hex strings.'
+    )
+
+    @validates_schema
+    def validate_cmap(self, data: Mapping[str, Any]) -> None:
+        if data.get('colormap', '') == 'explicit' and not data.get('explicit_color_map'):
+            raise ValidationError('explicit_color_map argument must be given for colormap=explicit',
+                                  'colormap')
+
+        if data.get('explicit_color_map') and data.get('colormap', '') != 'explicit':
+            raise ValidationError('explicit_color_map can only be given for colormap=explicit',
+                                  'explicit_color_map')

     @pre_load
-    def process_ranges(self, data: Mapping[str, Any]) -> Dict[str, Any]:
+    def decode_json(self, data: Mapping[str, Any]) -> Dict[str, Any]:
         data = dict(data.items())
-        var = 'stretch_range'
-        val = data.get(var)
-        if val:
-            try:
-                data[var] = json.loads(val)
-            except json.decoder.JSONDecodeError as exc:
-                raise ValidationError(f'Could not decode value for {var} as JSON') from exc
+        for var in ('stretch_range', 'explicit_color_map'):
+            val = data.get(var)
+            if val:
+                try:
+                    data[var] = json.loads(val)
+                except json.decoder.JSONDecodeError as exc:
+                    msg = f'Could not decode value {val} for {var} as JSON'
+                    raise ValidationError(msg) from exc
+
+        val = data.get('explicit_color_map')
+        if val and isinstance(val, dict):
+            for key, color in val.items():
+                if isinstance(color, str):
+                    hex_string = color.lstrip('#')
+                    try:
+                        rgb = [int(hex_string[i:i + 2], 16) for i in (0, 2, 4)]
+                        data['explicit_color_map'][key] = rgb
+                    except ValueError:
+                        msg = f'Could not decode value {color} in explicit_color_map as hex string'
+                        raise ValidationError(msg)
+
         return data


@@ -79,8 +117,9 @@ def get_singleband(tile_z: int, tile_y: int, tile_x: int, keys: str) -> Any:
     option_schema = SinglebandOptionSchema()
     options = option_schema.load(request.args)

-    image = singleband(
-        parsed_keys, tile_xyz, **options
-    )
+    if options.get('colormap', '') == 'explicit':
+        options['colormap'] = options.pop('explicit_color_map')
+
+    image = singleband(parsed_keys, tile_xyz, **options)

     return send_file(image, mimetype='image/png')
3 changes: 2 additions & 1 deletion terracotta/drivers/base.py
@@ -71,7 +71,8 @@ def get_metadata(self, keys: Union[Sequence[str], Mapping[str, str]]) -> Dict[st
     def get_raster_tile(self, keys: Union[Sequence[str], Mapping[str, str]], *,
                         bounds: Sequence[float] = None,
                         tilesize: Sequence[int] = (256, 256),
-                        nodata: Number = 0) -> np.ndarray:
+                        nodata: Number = 0,
+                        preserve_values: bool = False) -> np.ndarray:
         """Get raster tile as a NumPy array for given keys."""
         pass
