Validate client-submitted data using JSON Schema documents and convert JSON Schema documents into different data-interchange formats.
- Installation
- Usage
- Data Validation
- Data Validation CLI
- Data Validation API
- Structured Message Generation
- Supported Data-Interchange Formats
- Avro
- Data-Interchange CLI
- Data-Interchange API
- Testing
- Additional Resources
- Future Considerations
- Maintainers
- Contributing
- License
- Validate client-submitted data
- Convert JSON Schema documents into different data-interchange formats
- Simple syntax
- CLI support for data validation and JSON Schema conversion
- Stop Being a "Janitorial" Data Scientist
via pip
$ pip install aptos
via git
$ git clone https://github.com/pennsignals/aptos.git && cd aptos
$ python setup.py install
aptos supports the following capabilities:
- Data Validation: Validate client-submitted data using validation keywords described in the JSON Schema specification.
- Schema Conversion: Convert JSON Schema documents into different data-interchange formats. See the list of supported data-interchange formats for more information.
usage: aptos [arguments] SCHEMA
aptos is a tool for validating client-submitted data using the JSON Schema
vocabulary and converts JSON Schema documents into different data-interchange
formats.
positional arguments:
schema JSON document containing the description
optional arguments:
-h, --help show this help message and exit
Arguments:
{validate,convert}
validate Validate a JSON instance
convert Convert a JSON Schema into a different data-interchange
format
More information on JSON Schema: http://json-schema.org/
Here is a basic example of a JSON Schema:
{
  "title": "Person",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string"
    },
    "lastName": {
      "type": "string"
    },
    "age": {
      "description": "Age in years",
      "type": "integer",
      "minimum": 0
    }
  },
  "required": ["firstName", "lastName"]
}
Given a JSON Schema, aptos can validate client-submitted data to ensure that it satisfies the constraints the schema defines.
JSON Schema validation keywords such as minimum and required impose requirements that an instance must meet to validate successfully. In the JSON Schema above, both the firstName and lastName properties are required, and the age property MUST have a value greater than or equal to 0.
Valid Instance ✔️ | Invalid Instance ✖️ |
---|---|
{"firstName": "John", "lastName": "Doe", "age": 42} | {"firstName": "John", "age": -15} (missing required property lastName and age is not greater than or equal to 0) |
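To make the two keywords concrete, here is a minimal sketch in plain Python of what checking required and minimum amounts to for the Person schema. It is not how aptos implements validation; check_instance is a hypothetical helper written only for illustration, and it covers just these two keywords.

```python
# Illustrative only: a hand-rolled check of two JSON Schema keywords.
# `check_instance` is a hypothetical helper, not part of aptos.
PERSON_SCHEMA = {
    "title": "Person",
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {"description": "Age in years", "type": "integer", "minimum": 0},
    },
    "required": ["firstName", "lastName"],
}


def check_instance(instance, schema):
    """Return a list of violations of `required` and `minimum`."""
    errors = []
    # `required`: every listed property name must be present in the instance.
    for name in schema.get("required", []):
        if name not in instance:
            errors.append("missing required property '%s'" % name)
    # `minimum`: a value that is present must be >= the stated bound.
    for name, subschema in schema.get("properties", {}).items():
        if name in instance and "minimum" in subschema:
            if instance[name] < subschema["minimum"]:
                errors.append("'%s' must be >= %s" % (name, subschema["minimum"]))
    return errors


valid = {"firstName": "John", "lastName": "Doe", "age": 42}
invalid = {"firstName": "John", "age": -15}

print(check_instance(valid, PERSON_SCHEMA))    # no violations
print(check_instance(invalid, PERSON_SCHEMA))  # one violation per broken keyword
```

The helper collects every violation instead of stopping at the first one, which is why the invalid instance reports both problems listed in the table.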
aptos can validate client-submitted data using either the CLI or the API:
$ aptos validate -instance INSTANCE SCHEMA
Arguments:
- INSTANCE: JSON document being validated
- SCHEMA: JSON document containing the description
Example - macOS:
$ aptos validate -instance '{"firstName": "John"}' person.json
Example - Windows:
> aptos validate -instance "{\"firstName\": \"John\"}" person.json
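The macOS and Windows examples differ only in how the shell expects the JSON to be quoted. When the CLI is invoked from a script, one way to sidestep manual escaping is to serialize the instance with json.dumps and pass the arguments as a list. The sketch below assumes aptos is on the PATH and that person.json exists; build_validate_command is a hypothetical helper, and it uses only the validate subcommand and -instance flag shown above.

```python
import json
import shlex


def build_validate_command(instance, schema_path):
    """Build the argument list for `aptos validate` (hypothetical helper)."""
    # json.dumps produces the JSON text; passing it as a single list element
    # means no shell quoting or backslash escaping is needed.
    return ["aptos", "validate", "-instance", json.dumps(instance), schema_path]


command = build_validate_command({"firstName": "John"}, "person.json")

# POSIX-shell rendering of the same command, handy for copy and paste.
print(shlex.join(command))

# To actually run it (assumes aptos is installed and person.json exists):
# import subprocess
# subprocess.run(command)
```

Because the arguments are passed as a list, the same script works unchanged on macOS and Windows.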
import json
from aptos.parser import SchemaParser
from aptos.visitor import ValidationVisitor
with open('/path/to/schema') as fp:
    schema = json.load(fp)
component = SchemaParser.parse(schema)
# Invalid client-submitted data (instance)
instance = {
    'firstName': 'John'
}
try:
    component.accept(ValidationVisitor(instance))
except AssertionError as e:
    print(e)  # instance {'firstName': 'John'} is missing required property 'lastName'
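Since validation failures surface as AssertionError, a batch of instances can be checked by catching the error per instance. The following is a rough sketch of that pattern; collect_errors and require_last_name are hypothetical names introduced here, and a stand-in validator is used so the example runs without aptos installed.

```python
def collect_errors(instances, validate):
    """Run `validate` on each instance; return (index, message) for failures.

    `validate` is any callable that raises AssertionError on invalid input.
    """
    failures = []
    for index, instance in enumerate(instances):
        try:
            validate(instance)
        except AssertionError as error:
            failures.append((index, str(error)))
    return failures


# Stand-in validator so the sketch runs without aptos installed.
# With aptos, this could instead be:
#   lambda instance: component.accept(ValidationVisitor(instance))
def require_last_name(instance):
    assert "lastName" in instance, "missing required property 'lastName'"


batch = [
    {"firstName": "John", "lastName": "Doe"},
    {"firstName": "John"},
]
print(collect_errors(batch, require_last_name))
```

Passing the validator in as a callable keeps the loop independent of any particular schema or library.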
Given a JSON Schema, aptos can generate different structured messages.
Format | Supported | Notes |
---|---|---|
Apache Avro | ✔️ | |
Protocol Buffers | ✖️ | Planned for future releases |
Apache Thrift | ✖️ | Planned for future releases |
Apache Parquet | ✖️ | Planned for future releases |
Using the Person schema in the previous example, aptos can convert the schema into the Avro data-interchange format using either the CLI or the API.
aptos maps the following JSON Schema types to Avro types:
JSON Schema Type | Avro Type |
---|---|
string | string |
boolean | boolean |
null | null |
integer | long |
number | double |
object | record |
array | array |
JSON Schema documents containing the enum validation keyword are mapped to the Avro enum symbols attribute.
JSON Schema documents with the type keyword as an array are mapped to Avro Union types.
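The mapping rules above can be sketched as a small recursive function. This is an illustration of the table and the two notes, not aptos's AvroSchemaVisitor; to_avro and PRIMITIVES are hypothetical names, the default record name "Root" is an assumption, and only a small subset of JSON Schema is handled.

```python
import json

# Primitive mapping copied from the table above.
PRIMITIVES = {
    "string": "string",
    "boolean": "boolean",
    "null": "null",
    "integer": "long",
    "number": "double",
}


def to_avro(schema, name="Root"):
    """Convert a small subset of JSON Schema to an Avro schema (illustrative)."""
    json_type = schema.get("type")
    # `enum` keyword -> Avro enum with a `symbols` attribute.
    if "enum" in schema:
        return {"type": "enum", "name": name, "symbols": list(schema["enum"])}
    # `type` given as an array -> Avro union of the converted member types.
    if isinstance(json_type, list):
        return [to_avro(dict(schema, type=member), name) for member in json_type]
    # object -> record, with one Avro field per JSON Schema property.
    if json_type == "object":
        fields = [
            {"name": key, "type": to_avro(sub, key), "doc": sub.get("description", "")}
            for key, sub in schema.get("properties", {}).items()
        ]
        return {"type": "record", "name": schema.get("title", name), "fields": fields}
    # array -> array; this sketch requires the `items` keyword to be present.
    if json_type == "array":
        return {"type": "array", "items": to_avro(schema["items"], name)}
    return PRIMITIVES[json_type]


person = {
    "title": "Person",
    "type": "object",
    "properties": {
        "firstName": {"type": "string"},
        "lastName": {"type": "string"},
        "age": {"description": "Age in years", "type": "integer", "minimum": 0},
    },
    "required": ["firstName", "lastName"],
}
record = to_avro(person)
print(json.dumps(record, indent=2))

nullable_string = to_avro({"type": ["null", "string"]})
color = to_avro({"enum": ["RED", "GREEN"]}, name="Color")
```

Validation keywords such as minimum and required have no counterpart in this sketch and are simply dropped during conversion.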
$ aptos convert -format FORMAT SCHEMA
Arguments:
- FORMAT: Data-interchange format
- SCHEMA: JSON document containing the description
import json
from aptos.parser import SchemaParser
from aptos.schema.visitor import AvroSchemaVisitor
with open('/path/to/schema') as fp:
    schema = json.load(fp)
component = SchemaParser.parse(schema)
record = component.accept(AvroSchemaVisitor())
print(json.dumps(record, indent=2))
The above code generates the following Avro schema:
{
  "type": "record",
  "fields": [
    {
      "doc": "",
      "type": "string",
      "name": "lastName"
    },
    {
      "doc": "",
      "type": "string",
      "name": "firstName"
    },
    {
      "doc": "Age in years",
      "type": "long",
      "name": "age"
    }
  ],
  "name": "Person"
}
All unit tests exist in the tests directory.
To run tests, execute the following command:
$ python setup.py test
- Stop Being a "Janitorial" Data Scientist - A blog post explaining why aptos was created
- Understanding JSON Schema - An excellent guide for schema authors, from the Space Telescope Science Institute
- Swagger support
- Additional data-interchange formats
Jason Walsh
Contributions welcome! Please read the contributing.json file first.
Join our Slack channel!