Merge branch 'staging' into main
vcschapp committed Dec 17, 2024
2 parents 08916f8 + ad86e75 commit edfd638
Showing 17 changed files with 287 additions and 275 deletions.
33 changes: 28 additions & 5 deletions .github/pull_request_template.md
@@ -1,13 +1,36 @@
# Category

What kind of change is this?
Please select *one* of the following four options.

Please select *one* of the following five options.

Consult [Pull request merging criteria](https://github.com/OvertureMaps/schema-wg#Pull-request-merging-criteria) for a description of each category.

1. [ ] Cosmetic change.
2. [ ] Documentation change by member.
3. [ ] Documentation change by Overture tech writer.
4. [ ] Material change.
1. [ ] MAJOR schema change as defined in [Schema versioning and stability](https://lf-overturemaps.atlassian.net/wiki/x/GgDa).
2. [ ] MINOR schema change as defined in [Schema versioning and stability](https://lf-overturemaps.atlassian.net/wiki/x/GgDa).
3. [ ] Cosmetic change.
4. [ ] Documentation change by member.
5. [ ] Documentation change by Overture tech writer.

# Major change release plan

TODO: For any non-MAJOR change, delete this whole section.

*For a MAJOR change as defined in [Schema versioning and stability](https://lf-overturemaps.atlassian.net/wiki/x/GgDa),
indicate the expected release date, related minor change steps, and your
public documentation and messaging plan.*

## A. Expected release date for this MAJOR change

TODO.

## B. Related MINOR change steps

- TODO. List each related MINOR change as a bullet.

## C. Public documentation and messaging plan

TODO.

# Description

34 changes: 34 additions & 0 deletions .github/workflows/github-actions-enforce-change-type-label.yaml
@@ -0,0 +1,34 @@
name: Change Type label verification

on:
  pull_request:
    types: [opened, edited, labeled, unlabeled, synchronize]

jobs:
  check-label:
    runs-on: ubuntu-latest
    steps:
      - name: Require exactly one change type label
        uses: actions/github-script@v6
        with:
          script: |
            const allChangeTypeLabels = new Set([
              'change type - cosmetic 🌹',
              'change type - documentation - docs team 📝',
              'change type - documentation - member 📝',
              'change type - major 🚨',
              'change type - minor 🤏',
            ]);
            const prLabels = context.payload.pull_request.labels.map(label => label.name);
            const appliedChangeTypeLabels = prLabels.filter(prLabel => allChangeTypeLabels.has(prLabel));
            if (appliedChangeTypeLabels.length !== 1) {
              const baseMessage = `The PR must have EXACTLY one of the following CHANGE TYPE labels: ${Array.from(allChangeTypeLabels).sort().join(', ')}. `
              const n = appliedChangeTypeLabels.length;
              let contextualMessage;
              if (n === 0) {
                contextualMessage = 'It currently has no change type label. Please ➕ add one label. 🙏'
              } else {
                contextualMessage = `It currently has ${n} change type labels (${JSON.stringify(appliedChangeTypeLabels)}). 🙏 Please ❌ remove ${n-1} label(s).`
              }
              core.setFailed(baseMessage + contextualMessage);
            }
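
The check in the script above reduces to a set-membership filter followed by a count. As a minimal sketch, the same logic can be written as a pure function so it can be exercised outside GitHub Actions. The function name `checkChangeTypeLabels` and its return shape are assumptions made for illustration; they are not part of the workflow in this commit.

```javascript
// Hypothetical helper (not part of the workflow): mirrors the label check as a pure function.
const ALL_CHANGE_TYPE_LABELS = new Set([
  'change type - cosmetic 🌹',
  'change type - documentation - docs team 📝',
  'change type - documentation - member 📝',
  'change type - major 🚨',
  'change type - minor 🤏',
]);

// Returns { ok, applied }: ok is true only when exactly one change type label is present.
function checkChangeTypeLabels(prLabels) {
  const applied = prLabels.filter((name) => ALL_CHANGE_TYPE_LABELS.has(name));
  return { ok: applied.length === 1, applied };
}

console.log(checkChangeTypeLabels(['bug', 'change type - minor 🤏']).ok); // true
console.log(checkChangeTypeLabels(['bug']).ok); // false
```

Labels outside the change type set (such as `bug` here) are ignored, so only the count of change type labels decides the result.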
69 changes: 39 additions & 30 deletions docs/schema/0-Schema.mdx
@@ -1,5 +1,5 @@
---
title: Overture Schema
title: Overview
slug: /schema

# This page is available at docs.overturemaps.org/schema
@@ -11,44 +11,58 @@ import JSONSchemaViewer from "@theme/JSONSchemaViewer";
import generateResolverOptions from "@site/src/components/shared-libs/generateResolverOptions"
import StringifyObject from "@site/src/components/shared-libs/stringifyObject"
import yamlLoad from "@site/src/components/yamlLoad"
import EmpireStateBuilding from "!!raw-loader!@site/docs/_examples/buildings/empire-state-building.json";
import MainDefs from "!!raw-loader!@site/docs/_schema/defs.yaml";

Overture data is structured by three components: the schema, the data model, and the Global Entity Reference System ([GERS](https://docs.overturemaps.org/gers/)). The schema describes the shape of the data and defines the constraints applied to that data. The data model specifies what types of features exist, their geometries, how the features relate to each other, and what kind of properties they have. GERS is a framework for structuring, encoding, and matching Overture data to a shared universal reference.
## A unified schema
Overture is developing one schema to structure all of our datasets. We follow the [JSON schema standard](https://json-schema.org/) in our schema design and we use [GeoJSON](https://geojson.org/) as a model for encoding feature geometries in our datasets. [The schema itself is written in YAML](https://github.com/OvertureMaps/schema/blob/dev/schema/schema.yaml) for readability and ease of use.

### GeoJSON and GeoParquet
Although JSON and GeoJSON serve as our mental models for defining the Overture schema, we distribute our datasets in [GeoParquet](https://geoparquet.org/), a column-oriented format optimized for handling large-scale geospatial datasets.

<!-- You can see all three components in this description of one feature, the Empire State Building:
There are key differences in how geometries and other feature properties are represented in GeoJSON and GeoParquet. In Overture's schema design, we follow the GeoJSON specification and encode geometry objects as human-readable Point, LineString, Polygon, and MultiPolygon types. A feature in GeoJSON consists of a single geometry object accompanied by a set of properties represented as key-value pairs.

<CodeBlock language="json">{ EmpireStateBuilding }</CodeBlock> -->
This same feature can be represented as a single row in a GeoParquet file, with the geometry in one column, encoded either as [Well-Known Binary (WKB)](https://libgeos.org/specifications/wkb/) or as [native Arrow-encoded coordinate columns](https://geoarrow.org/format.html), and the other feature properties filling additional columns in the file.
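
To make the GeoJSON-to-row mapping concrete, here is a minimal sketch that encodes a GeoJSON Point as little-endian WKB and flattens a feature into a single row object. The helper names `pointToWkb` and `featureToRow`, the example `id`, and the property values are all placeholders assumed for illustration; this is not Overture's release pipeline, and it handles only Point geometries.

```javascript
// Encode a GeoJSON Point geometry as little-endian WKB (21 bytes).
function pointToWkb(geometry) {
  if (geometry.type !== 'Point') {
    throw new Error('this sketch only handles Point geometries');
  }
  const [x, y] = geometry.coordinates;
  const view = new DataView(new ArrayBuffer(21));
  view.setUint8(0, 1);          // byte order flag: 1 = little-endian
  view.setUint32(1, 1, true);   // geometry type code: 1 = Point
  view.setFloat64(5, x, true);  // X (longitude)
  view.setFloat64(13, y, true); // Y (latitude)
  return new Uint8Array(view.buffer);
}

// Hypothetical feature; the id and property values are placeholders, not real Overture data.
const feature = {
  type: 'Feature',
  id: 'example-id',
  geometry: { type: 'Point', coordinates: [1, 2] },
  properties: { theme: 'places', type: 'place', version: 1 },
};

// Flatten the feature into one row: the geometry becomes a binary column
// and each property becomes its own column.
function featureToRow(f) {
  return { id: f.id, geometry: pointToWkb(f.geometry), ...f.properties };
}

const row = featureToRow(feature);
const hex = Array.from(row.geometry, (b) => b.toString(16).padStart(2, '0')).join('');
console.log(hex);
```

The key-value properties of the GeoJSON feature become sibling columns of `geometry` in the row, which is the structural difference between the two formats that the paragraph above describes.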

## GeoJSON mental model

The Overture schema is defined by the [JSON schema](https://json-schema.org/), and [GeoJSON](https://geojson.org/) is used as the canonical geospatial format. GeoJSON provides us with a mental model and language to express data constructions in the schema. The Overture schema supports the following geometry types: Point, LineString, Polygon, MultiPoint, MultiLineString, and MultiPolygon. Together geometric objects and their properties are called features.
### Top-level properties
In the Overture schema, all features have a unique `id` called a [GERS ID](https://docs.overturemaps.org/gers/), a `geometry` object that follows the GeoJSON schema specification, and the following top-level properties:

## Features represent entities
<JSONSchemaViewer
schema={ yamlLoad(MainDefs) }
resolverOptions={ generateResolverOptions( {yamlBasePath: '/', jsonPointer:"#/$defs/propertyContainers/overtureFeaturePropertiesContainer" })}/>

Overture uses the [simple feature model](https://www.ogc.org/standard/sfa/) specified by the Open Geospatial Consortium to describe each feature. Features in Overture represent entities in the real world. An entity is a physical thing or concept: a segment of road, a city boundary, a building, or a park. In most cases it's helpful to think of an entity and a feature as the same thing, but in practice it can be more complicated. An entity could be represented by multiple features in a geospatial dataset, and a feature in a dataset might be a representation of multiple entities. For example, a school building and its entrances and exits might be considered a single entity in the real world but could be represented as multiple features in an Overture dataset, each feature with a unique ID.
<!-- Below is an example of how you can reference just 1 property within the properties container
<JSONSchemaViewer
schema={ yamlLoad(MainDefs) }
resolverOptions={ generateResolverOptions( {yamlBasePath: '/', jsonPointer:"#/$defs/propertyContainers/overtureFeaturePropertiesContainer/properties/geometry" })}/>
-->

## Global Entity Reference System (GERS)
The data types for each property in the Overture schema design do not map exactly to the permitted [data types in Parquet](https://parquet.apache.org/docs/file-format/types/) and [GeoParquet](https://geoparquet.org/releases/v1.1.0/). We release our datasets with the top-level properties encoded in this way:

All features in Overture have unique IDs called Overture IDs. For some feature types, the Overture ID is registered to GERS. This means a feature can be tracked from one Overture data release to another, and any changes to that feature can be encoded in a GERS changelog.
<details>
<summary>**GeoParquet columns for top-level Overture properties**</summary>
| column_name | column_type | description |
| --- | --- | --- |
| **id** | *string* | an Overture feature's unique id, part of the Global Entity Reference System (GERS) |
| **geometry** | *binary* | well-known binary (WKB) representation of the feature geometry |
| **bbox** | *struct\<xmin: float, xmax: float, ymin: float, ymax: float\>* | area defined by two longitudes and two latitudes: latitude is a decimal number between -90.0 and 90.0; longitude is a decimal number between -180.0 and 180.0. |
| **theme** | *string* | one of six Overture data themes |
| **type** | *string* | one of 14 Overture feature types |
| **version** | *int32* | version number of the feature, incremented in each Overture release where the geometry or attributes of this feature changed |
| **sources** | *list\<element: struct\<property: string, dataset: string, record_id: string, update_time: string, confidence: double\>\>* | array of source information for the properties of a given feature |
</details>
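
The coordinate ranges stated for the `bbox` column can be expressed as a small check. This is a sketch under stated assumptions: the function name `isValidBbox` is hypothetical, the sample values are illustrative rather than taken from a release, and the check covers only the ranges given in the table, not any ordering between `xmin` and `xmax`.

```javascript
// Range check taken from the bbox column description; the function name is hypothetical.
function isValidBbox({ xmin, xmax, ymin, ymax }) {
  const inRange = (v, lo, hi) => Number.isFinite(v) && v >= lo && v <= hi;
  return (
    inRange(xmin, -180, 180) && inRange(xmax, -180, 180) && // longitudes
    inRange(ymin, -90, 90) && inRange(ymax, -90, 90)        // latitudes
  );
}

// Illustrative values only.
console.log(isValidBbox({ xmin: -73.99, xmax: -73.98, ymin: 40.74, ymax: 40.75 })); // true
console.log(isValidBbox({ xmin: -73.99, xmax: -73.98, ymin: 40.74, ymax: 95 }));    // false
```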

GERS also provides a mechanism to conflate datasets, matching one or more features via Overture IDs. For example, two polygon features from two different datasets, each polygon representing the footprint of the Empire State Building in New York City, can be easily matched if both features reference the same Overture ID in GERS.
### Other key schema properties
Most but not all of the feature types in the Overture schema require data for the `names`, `subtype`, and `class` properties. The `names` property is complex enough to have its own schema, which we describe in detail [here](/schema/concepts/names).

## Schema characteristics
### Properties that may be specific to a feature type
Some properties in the Overture schema are only populated with data for specific feature types. For example, the `place` feature type must include data for the `categories` property, as required by the schema. The `division_area` and `address` feature types require the `country` property to be populated with ISO 3166-1 alpha-2 country codes. The `segment` feature type in the transportation theme is the only feature type that includes data for a complex set of properties that describe roads. The [schema concepts](concepts) section of this documentation describes these schema complexities in detail.

### Core schema properties

Every feature in Overture has a core set of properties that are described in the schema. Overture features:
## Schema conventions
In addition to following the JSON and GeoJSON specifications, the Overture schema has its own style and conventions. The notations, nomenclatures, specifications, and standards we have adopted are described below.

- have a type
- have a geometry, where the type of geometry is constrained by the feature type
- are strongly-typed, _i.e._ the feature type constrains the geometry and properties
- have properties, which may include a core set of "flat" properties and additional properties with a nested structure
- have an ID property which is globally unique within the ID-space of the entire Overture data distribution version. For some feature types, the ID is registered with GERS
- may have custom user extension properties

### Schema notation conventions
### Notations

- snake case is used for all property names, string enumeration members, and string-valued enumeration equivalents
- boolean properties have a prefix verb "is" or "has" in a way that grammatically makes sense
@@ -62,10 +76,6 @@ Measurements of real-world objects and features follow [The International System

Quantities specified in regulatory rules, norms, and customs follow local specifications wherever possible. In the schema, these values are provided as two-element arrays where the first element is the scalar numeric value and the second element is the unit. Overture uses local units of measurement -- feet in the United States and meters in the EU, for example. The exact unit is confirmed in the specification of the property but is not repeated in the data.

### Relations

Low-cardinality directed relations are stored as ID references on the source feature.

### Regulations and restrictions

All quantities that relate to posted or ordnance regulations and restrictions are expressed in the same units as used in the regulation. The unit is explicitly included with the property in the data.
@@ -74,10 +84,9 @@ All quantities that relate to posted or ordnance regulations and restrictions ar

Opening hours and the time frames during which time-dependent properties apply are indicated following the [OSM Opening Hours specification](https://wiki.openstreetmap.org/wiki/Key:opening_hours/specification).

<!-- This is not yet true
### Extensions

Overture allows for ad hoc extensions beyond what is described in the schema. All extensions are prefixed with `ext_`. Extensions can be provided at the theme level, the type level, or the property level.
-->

## Data formats

While Overture describes data using a GeoJSON mental model, it distributes data as [GeoParquet](https://geoparquet.org/), a column-oriented format that is ideally suited for large geospatial datasets. This documentation includes many examples of how to work with data stored in GeoParquet files.
2 changes: 1 addition & 1 deletion docs/schema/concepts/by-theme/addresses/index.mdx
@@ -1,5 +1,5 @@
---
title: Addresses
title: Addresses schema concepts
draft: true
---

4 changes: 2 additions & 2 deletions docs/schema/concepts/by-theme/base/index.mdx
@@ -1,5 +1,5 @@
---
title: Base
title: Base schema concepts
---

import Tabs from '@theme/Tabs';
@@ -14,7 +14,7 @@ import OSMtoOvertureWater from '!!raw-loader!@site/src/queries/partials/osm_conv


## Overview
The Overture base theme includes additional features desired for rendering a complete basemap that are not yet associated with the global entity reference system (GERS), nor have they been through a rigorous schema definition process. Instead, we assign just a subtype and class to the feature and pass relevant attributes through in the `source_tags` attribute. Most of the features in the `base` theme come from OpenStreetMap via the [Daylight Map Distribution](https://daylightmap.org/).
The Overture base theme includes features desired for rendering a complete basemap: bathymetry, infrastructure, land, land cover, land use, and water. These features are not yet associated with the Global Entity Reference System (GERS), nor have they been through a rigorous schema definition process. We assign a subtype and class to each feature and pass relevant properties through in the `source_tags` property. Most of the features in the `base` theme come from OpenStreetMap via the [Daylight Map Distribution](https://daylightmap.org/).

## Feature types
The base theme has five feature types.
2 changes: 1 addition & 1 deletion docs/schema/concepts/by-theme/buildings/index.mdx
@@ -1,5 +1,5 @@
---
title: Buildings
title: Buildings schema concepts
---
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
2 changes: 1 addition & 1 deletion docs/schema/concepts/by-theme/divisions/index.mdx
@@ -1,5 +1,5 @@
---
title: Divisions
title: Divisions schema concepts
draft: true
---

2 changes: 1 addition & 1 deletion docs/schema/concepts/by-theme/places/index.mdx
@@ -1,5 +1,5 @@
---
title: Places
title: Places schema concepts
---

import overture_categories from '!!raw-loader!./overture_categories.csv';