Geobuf is a compact binary geospatial format for lossless compression of GeoJSON and TopoJSON data.
Advantages over using GeoJSON and TopoJSON directly (in this revised version):
- Very compact: typically makes GeoJSON 6-8 times smaller and TopoJSON 2-3 times smaller.
- Smaller even when comparing gzipped sizes: 2-2.5x compression for GeoJSON and 20-30% for TopoJSON.
- Easy incremental parsing: you can get features out as you read them, without building an in-memory representation of the whole dataset.
- Partial reads — you can read only the parts you actually need, skipping the rest.
- Trivial concatenation: you can concatenate many Geobuf files together and they will form a valid combined Geobuf file.
- Potentially faster encoding/decoding compared to native JSON implementations (e.g. in web browsers).
- Can still accommodate any GeoJSON and TopoJSON data, including extensions with arbitrary properties.
Think of this as an attempt to design a simple, modern Shapefile successor that works seamlessly with GeoJSON and TopoJSON.
Unlike Mapbox Vector Tiles, it aims for lossless compression of datasets — without tiling, projecting coordinates, flattening geometries or stripping properties.
This repository is the first encoding/decoding implementation of this new major version of Geobuf (in Python). It serves as a prototyping playground, with faster implementations in JS and C++ planned for the future.
file                | normal    | gzipped
------------------- | --------- | --------
us-zips.json        | 101.85 MB | 26.67 MB
us-zips.pbf         | 12.24 MB  | 10.48 MB
us-zips.topo.json   | 15.02 MB  | 3.19 MB
us-zips.topo.pbf    | 4.85 MB   | 2.72 MB
idaho.json          | 10.92 MB  | 2.57 MB
idaho.pbf           | 1.37 MB   | 1.17 MB
idaho.topo.json     | 1.9 MB    | 612 KB
idaho.topo.pbf      | 567 KB    | 479 KB
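
The ratios quoted in the list of advantages can be checked against the table with a little arithmetic. The sketch below hard-codes the sizes from the table and prints how many times smaller each Geobuf file is than its JSON counterpart. Converting the KB entries with 1 MB = 1024 KB is an assumption about the units, and the `ratios` helper exists only for this illustration.

```python
# Sizes copied from the table above, in MB: (normal, gzipped).
# Assumption: 1 MB = 1024 KB when converting the KB entries.
KB = 1 / 1024

SIZES_MB = {
    "us-zips.json": (101.85, 26.67),
    "us-zips.pbf": (12.24, 10.48),
    "us-zips.topo.json": (15.02, 3.19),
    "us-zips.topo.pbf": (4.85, 2.72),
    "idaho.json": (10.92, 2.57),
    "idaho.pbf": (1.37, 1.17),
    "idaho.topo.json": (1.9, 612 * KB),
    "idaho.topo.pbf": (567 * KB, 479 * KB),
}


def ratios(json_name, pbf_name):
    """Return (normal, gzipped) size ratios of a JSON file to its Geobuf version."""
    json_normal, json_gz = SIZES_MB[json_name]
    pbf_normal, pbf_gz = SIZES_MB[pbf_name]
    return json_normal / pbf_normal, json_gz / pbf_gz


for stem in ("us-zips", "us-zips.topo", "idaho", "idaho.topo"):
    normal, gzipped = ratios(stem + ".json", stem + ".pbf")
    print(f"{stem}: {normal:.1f}x smaller, {gzipped:.2f}x smaller when gzipped")
```

For example, `us-zips.json` at 101.85 MB against `us-zips.pbf` at 12.24 MB gives a ratio of roughly 8.3.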
Command line:
geobuf encode < example.json > example.pbf
geobuf decode < example.pbf > example.pbf.json
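
To compare raw and gzipped sizes on your own data after running the commands above, something like the following sketch could be used. It relies only on the Python standard library. The helper names `raw_and_gzipped_size` and `report` are made up for this illustration, and the file names are simply the ones from the commands above.

```python
import gzip
import os


def raw_and_gzipped_size(data):
    """Return (raw, gzipped) sizes in bytes for a bytes object."""
    return len(data), len(gzip.compress(data))


def report(path):
    """Print the raw and gzipped size of the file at `path`."""
    with open(path, "rb") as f:
        raw, gzipped = raw_and_gzipped_size(f.read())
    print(f"{path}: {raw} bytes, {gzipped} bytes gzipped")


if __name__ == "__main__":
    # File names taken from the commands above; skipped if they do not exist.
    for path in ("example.json", "example.pbf"):
        if os.path.exists(path):
            report(path)
```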
As a module:
import geobuf
pbf = geobuf.encode(my_json) # GeoJSON or TopoJSON -> Geobuf string
my_json = geobuf.decode(pbf) # Geobuf string -> GeoJSON or TopoJSON
The `encode` function accepts a dict-like object, for example the result of `json.loads(json_str)`.
Both `encode.py` and `geobuf.encode` accept two optional arguments:
- `precision`: max number of digits after the decimal point in coordinates, `6` by default.
- `dimensions`: number of dimensions in coordinates, `2` by default.
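
To get a feel for what the precision setting means for the data, the sketch below rounds every coordinate in a GeoJSON coordinates array to a given number of decimal places. It only illustrates the effect on coordinates and is not the encoder's actual implementation; the helper name `round_coords` is hypothetical.

```python
def round_coords(coords, precision=6):
    """Recursively round a GeoJSON coordinates array to `precision` decimals."""
    if isinstance(coords, (int, float)):
        return round(coords, precision)
    # Nested lists cover points, lines, polygons and multi-geometries alike.
    return [round_coords(c, precision) for c in coords]


point = {"type": "Point", "coordinates": [-122.4194155789, 37.7749295123]}
rounded = round_coords(point["coordinates"], precision=6)
print(rounded)
```

Assuming the coordinates are longitude and latitude in degrees, six decimal places correspond to roughly 0.1 m on the ground at the equator.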
py.test -v
The tests run through all `.json` files in the `fixtures` directory, comparing each original GeoJSON with an encoded/decoded one.
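
A round-trip check of this kind might look like the sketch below. It is not the repository's actual test code: the `roundtrip_matches` and `check_fixtures` helpers, the `fixtures/*.json` glob pattern, and passing the encoder and decoder in as arguments are all assumptions made so the example stays self-contained.

```python
import glob
import json


def roundtrip_matches(path, encode, decode):
    """Return True if decode(encode(data)) equals the JSON stored at `path`."""
    with open(path) as f:
        original = json.load(f)
    return decode(encode(original)) == original


def check_fixtures(encode, decode, pattern="fixtures/*.json"):
    """Return the fixture paths whose round trip does not reproduce the original."""
    return [path for path in sorted(glob.glob(pattern))
            if not roundtrip_matches(path, encode, decode)]
```

With the package installed, calling `check_fixtures(geobuf.encode, geobuf.decode)` should return an empty list when every fixture survives the round trip unchanged.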