V3 casl 530 (#355)
* Fix JBON encoding/decoding of tags, fix JBON documentation, remove long deprecated structs.

* Adjust tags encoding to use the same as feature.

* Partition by store_number, which does not require md5 hashing (we anyway do it in the client).

* Move deprecated (not needed) code into naksha_sql_fixme.

* Fix naskha_tags.

---------

Co-authored-by: Alexander Lowey-Weber <[email protected]>
xeus2001 and Alexander Lowey-Weber authored Oct 2, 2024
1 parent a5f8681 commit 473e58f
Showing 19 changed files with 765 additions and 1,210 deletions.
19 changes: 18 additions & 1 deletion browser_test.html
@@ -44,9 +44,21 @@
const { PgUtil } = require("naksha_psql");
const { JbEncoder, JbDecoder, JbFeatureDecoder } = require("naksha_jbon");

function encodeTags(json) {
if (!json) json = '{"name":"foo","age":48}'
let raw = json;
if (typeof raw === "string") raw = Platform.fromJSON(json);
let klass = Platform.klassFor(base.AnyObject)
let map = Platform.proxy(raw, klass)
let enc = new JbEncoder();
enc.encodeMap(map, false);
let f = enc.buildFeature(null, jbon.FEATURE_VARIANT_TAGS)
return f
}
function encodeFeature(json) {
if (!json) json = '{"id":"foo","properties":{"name":"Bar"}}'
let raw = Platform.fromJSON(json)
let raw = json;
if (typeof raw === "string") raw = Platform.fromJSON(json);
let klass = Platform.klassFor(base.AnyObject)
let map = Platform.proxy(raw, klass)
let enc = new JbEncoder();
@@ -58,6 +70,11 @@
dec.mapBytes(f)
return dec.toAnyObject()
}
function decodeTags(f) {
let dec = new JbFeatureDecoder()
dec.mapBytes(f)
return dec.toMap()
}
</script>
</head>
<body>
97 changes: 29 additions & 68 deletions docs/JBON.md
@@ -72,16 +72,16 @@ All units start with a **lead-in** byte, which describes the type of the unit. T
- If the size is not embedded (61-63), then followed by 1, 2 or 4 byte unsigned biased integer (biased by 60), big-endian encoded.
- `11`: struct
- `11ss_vvtt` = Struct (ss: 0=empty, 1=uint8, 2=uint16, 3=uint32)
- Followed by one byte, two byte or four byte unsigned content size, big-endian encoded.
- If standard structure (vv=0, variant=null)
- If not empty, followed by one byte, two byte or four byte unsigned integer storing the content size, big-endian encoded.
- If structure without variant (vv=0, variant=null)
- `0`: Array
- `1`: Map
- `2`: Dictionary
- `3`: Reserved
- If variant structure (vv: 1=byte, 2=short, 3=int)
- Followed by one byte, two byte or four byte integer storing the variant, big-endian encoded.
- `0`: Record
- `1`: XYZ
- If structure with variant (vv: 1=byte, 2=short, 3=int)
- Followed by one byte, two byte or four byte unsigned integer storing the variant, big-endian encoded.
- `0`: Feature
- `1`: Naksha
- `2`: Custom
- `3`: Reserved
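
The struct lead-in layout above can be sketched in a few lines of JavaScript. This is only an illustration, not the library's decoder: `parseStructHeader` and `readUintBE` are hypothetical names, and the sketch assumes that `11ss_vvtt` is written most-significant bit first and that the size field precedes the variant field.

```javascript
// Sketch only: split a JBON struct lead-in byte (11ss_vvtt) into its fields.
// Assumption: bits are listed most-significant first; the size comes before the variant.
const WIDTH = [0, 1, 2, 4]; // field width in bytes for the codes 0..3

// Read an unsigned big-endian integer of `width` bytes (0 bytes yields 0).
function readUintBE(bytes, offset, width) {
  let value = 0;
  for (let i = 0; i < width; i++) value = value * 256 + bytes[offset + i];
  return value;
}

// Hypothetical helper: returns the raw field values without interpreting them.
function parseStructHeader(bytes, offset = 0) {
  const leadIn = bytes[offset];
  if ((leadIn >>> 6) !== 0b11) throw new Error("not a struct lead-in");
  const sizeWidth = WIDTH[(leadIn >>> 4) & 3];    // ss
  const variantWidth = WIDTH[(leadIn >>> 2) & 3]; // vv
  const type = leadIn & 3;                        // tt
  let pos = offset + 1;
  const size = readUintBE(bytes, pos, sizeWidth);
  pos += sizeWidth;
  const variant = variantWidth === 0 ? null : readUintBE(bytes, pos, variantWidth);
  pos += variantWidth;
  return { type, size, variant, contentOffset: pos };
}
```

For example, `parseStructHeader([0xD4, 5, 2])` reads `ss=1`, `vv=1`, `tt=0` and returns `{ type: 0, size: 5, variant: 2, contentOffset: 3 }`. Whether the stored size counts the variant bytes is not stated here, so the sketch reports the raw value only.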

@@ -145,11 +145,14 @@ The code-points are variable encoded. The leading byte of every code-point signa
**Note**: The `ss`-bits improve the compression greatly, because the encoder will split strings by default at a space or underscore. Where such a split happens, the separator character does not need to be encoded. The reason to cut at these two characters is that street names and other human text most often use the space as separator, while constants in programming most often use the underscore (TYPE_A, TYPE_B, ...). Additionally, there is room for one more split character, to be defined by experience in the future.

## Structures
All other special types are structures. The header stores the outer size of the structure, so the bytes following the **header**.
The header stores the outer size of the structure, that is, the number of bytes following the **header**.

Note that there are two basic kind of structures. Those with a subtype (variant) and those without. The first 8 structure types are without variant, the last 8 are with variant. The variant is encoded as integer directly after the structure header. The variant is used to define subtypes for structures to relax the namespace, because there are actually only 16 generic structure types available.
Note that there are two basic kinds of structures: those with a subtype (variant) and those without. The first 4 structure types are without variant, the last 4 are with variant. The variant is encoded as an integer directly after the structure header. The variant is used to define subtypes for structures to relax the namespace, because there are only 8 generic structure types available.

For this reason the JBON specification defines one custom variant, that is shared by all users of JBON, and should be used to encode arbitrary (application specific) structures. This allows applications to define their own structures and allows to define up to 2 billion own custom structure.
For this reason, the JBON specification defines one custom structure that is shared by all users of JBON and should be used to encode arbitrary (application-specific) structures. This allows applications to define up to 2 billion custom structures of their own.

## Structures without Variant
These structures normally occur only within variant structures, because they do not have any special header that allows linking them to dictionaries.

### Array (0)
The array is just a sequence of units encoded.
@@ -171,72 +174,30 @@ After the header, the content follows. The content is simply a sequence of units

### Reserved (3)

### Record (0+variant)
A JBON record is a container for any other JBON unit. It is mainly used to link the embedded unit to a dedicated global and local dictionary. The format looks like:
## Structures with Variant (_subtype_)
These structures are final: they combine dictionaries and headers with the actual content. Normally, all internal structures and values are embedded into one of these structures.

### Feature (_0 + variant_)
A JBON feature is a container for any other JBON unit. It is mainly used to link the embedded unit to a dedicated global and local dictionary. The format looks like:

- The **id** of the global dictionary to be used (**string**), can be _null_.
- The **id** of the record, **string**, **text** or _null_.
- The **id** of the feature, **string**, **text** or _null_.
- The embedded local dictionary.
- The embedded JBON object (the root object).

A record can't create references to other records, only into a global dictionaries with unique identifiers. From an encoder perspective this is all.

### Xyz (1+variant)
This type is reserved for XYZ interactions. It is a flat object, optimized to be very small, with the following layout:

- Variant as **integer** (either **int5**, **int8**, **int16** or **int32**).
- ... content dependent on the variant

#### XyzNs (variant:0)
The information that the database manages and what is delivered by the database.

- **createdAt** (timestamp)
- **updatedAt** (timestamp) - _null_, if being the same as **createdAt**
- **txn** (BigInt64)
- **action** (integer), constants for CREATE (0), UPDATE (1) and DELETE (2)
- **version** (integer)
- **author_ts** (timestamp) - _null_, if being the same as **updatedAt**, which can be the same as **createdAt**
- **extend** (double)
- **puuid** (string or _null_)
- **uuid** (string)
- **app_id** (string)
- **author** (string)
- **grid** (string) SELECT ST_GeoHash(ST_Centroid(geo),7);

Notes: Tags are now a dedicated map, but when exposed, they are joined by an equal sign, the _null_ is default and causes the equal sign to disappear. So the tag "foo" becomes "tag=null" and when converting back "tag=null" is converted into "tag". Any other value, not being _null_, will be encoded into the tag. We do not allow equal signs otherwise, so only one equal sign is allowed in a tag. We do this, because we add an GIN index on the tags and allows key-value search at low level.

#### XyzOp (variant:1)
The information that clients should send to the database to write records or collections. This has to be provided together with a new record.

- **op** (integer) - The requested operation (CREATE, UPDATE, UPSERT, DELETE or PURGE).
- **id** (string) - The record-id.
- **uuid** (string or _null_) - If not _null_, then the operation is atomic and the state must be this one (only UPDATE, DELETE and PURGE).
- **grid** (string or _null_) - If the geo-reference-id is calculated by the client.

#### XyzTags (variant:2)
The tags, basically just a normal JBON map, but the values must only be **null**, **boolean**, **string** or **float64**. The map is preceded by the **id** of the global dictionary to be used, can be **null**, so actually being:

- **id** (string or _null_) of the global dictionary to use.
- Now the tags follow, split into a key and value part:
- **string** or **string-reference** - The key or reference to the key to index.
- **null**, **boolean**, **string**, **string-reference**, **integer** or **float**. If an integer is stored, it must be exposed as floating point number.

Tags do not support integers directly, but as floating pointer numbers support up to 53-bit precision with integer values, a limited amount of integer support is available.

**Note**: Externally _tags_ are only arrays of strings, therefore to convert external to internal representation the equal sign is used to split individual tag-strings. If a colon is set in-front of the equal sign, a value conversion is done, so _"foo=12"_ results in the value being a string "12", while _"foo:=12"_ results in a value being a floating point number _12_. Please read more about tags in the [documentation](../docs/TAGS.md).

#### XyzTxDetails (variant:3, draft)
Details about a transaction:

- **collections** - A map where the key is the collection identifier and the value is an integer bit-mask with what happened.

### Custom-Variant (2+variant)
An undefined type that any application can use for internal binary encodings. It is a flat object, optimized to be very small, with the following layout:
A feature can't create references to other features, only into global dictionaries with unique identifiers. From an encoder perspective this is all.

- Variant as **integer** (either **int5**, **int8**, **int16** or **int32**).
- ... content dependent on the variant
#### Feature Variants
- `0`: Unknown variant.
- `1`: GeoJSON feature.
- `2`: Naksha Tags.
- `3 .. 1048575`: Reserved for Naksha (below 2^20).
- `1048576 .. 4294967295`: Available for custom variants.
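
The variant ranges listed above can be checked with a small helper. This is a minimal sketch, assuming the numeric bounds exactly as listed; `describeFeatureVariant` and the returned labels are hypothetical and not part of the JBON library.

```javascript
// Sketch only: map a feature variant number to the range it falls into.
// The bounds are copied from the list above; the labels are illustrative.
function describeFeatureVariant(variant) {
  if (!Number.isInteger(variant) || variant < 0 || variant > 4294967295) {
    throw new RangeError("variant must be an unsigned 32-bit integer");
  }
  if (variant === 0) return "unknown";
  if (variant === 1) return "geojson";
  if (variant === 2) return "naksha-tags";
  if (variant <= 1048575) return "reserved-for-naksha";
  return "custom";
}
```

An application that defines its own variants would pick numbers from `1048576` upwards, so `describeFeatureVariant(1048576)` yields `"custom"`, while `describeFeatureVariant(3)` yields `"reserved-for-naksha"`.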

### Reserved (3+variant)
### Reserved (_1 + variant_)
### Reserved (_2 + variant_)
### Custom-Variant (_3 + variant_)
An undefined type that any application can use for internal binary encodings.

## Extended Proposals
Encoding a GeoJSON position. This is a proposal for a complex value that consists of longitude, latitude and an optional altitude.
@@ -39,7 +39,6 @@ open class JbDecoder {
TYPE_MAP -> "struct-map"
TYPE_DICTIONARY -> "struct-dictionary"
TYPE_FEATURE -> "struct-feature"
TYPE_XYZ -> "struct-xyz"
TYPE_CUSTOM -> "struct-custom"
else -> "undefined"
}
@@ -853,12 +852,6 @@ open class JbDecoder {
*/
fun isArray(): Boolean = unitType() == TYPE_ARRAY

/**
* Test if the current offset is at the lead-in of an XYZ special.
* @return true if the current offset is at the lead-in of an XYZ special; false otherwise.
*/
fun isXyz(): Boolean = unitType() == TYPE_XYZ

/**
* Read the current unit.
* @return _null_, [Boolean], [Int], [Int64], [Double], [String], [AnyObject] or [Array].
@@ -835,7 +835,6 @@ open class JbEncoder(var global: JbDictionary? = null) : Binary() {
* @return The JBON representation of the feature, the XYZ-namespace and the geometry.
*/
fun buildFeatureFromMap(map: MapProxy<String, *>): ByteArray {
// TODO: Make the ignore configurable!
clear()
val id: String? = map.getAs("id", String::class)
xyz = null
@@ -853,7 +852,7 @@ open class JbEncoder(var global: JbDictionary? = null) : Binary() {
}
}
endMap(start)
return buildFeature(id)
return buildFeature(id, FEATURE_VARIANT_GEO_JSON)
}

/**
@@ -938,12 +937,12 @@ open class JbEncoder(var global: JbDictionary? = null) : Binary() {
}

/**
* Creates a feature out of this builder and the current local dictionary.
* Creates a feature out of this builder, and the current local dictionary.
* @param id The unique identifier of the feature, may be null.
* @param variant The variant to write.
* @return The feature.
*/
fun buildFeature(id: String? = null, variant: Int = 0): ByteArray {
fun buildFeature(id: String? = null, variant: Int = FEATURE_VARIANT_UNKNOWN): ByteArray {
// The content (was already written).
val startOfFeaturePayload = 0
check(end > 0) { "Can't build empty feature" }
@@ -139,7 +139,7 @@ open class JbFeatureDecoder(dictManager: IDictManager? = null) : JbRecordDecoder
*/
fun toMap(): AnyObject {
val feature = _map.toAnyObject()
if ("id" in feature) feature.setRaw("id", id())
if (id() != null && "id" !in feature) feature.setRaw("id", id())
return feature
}

12 changes: 8 additions & 4 deletions here-naksha-lib-jbon/src/commonMain/kotlin/naksha/jbon/Static.kt
@@ -91,12 +91,16 @@ const val TYPE_STRING = CLASS_STRING or 0b0000
const val TYPE_ARRAY = CLASS_STRUCT or 0b0000
const val TYPE_MAP = CLASS_STRUCT or 0b0001
const val TYPE_DICTIONARY = CLASS_STRUCT or 0b0010
const val TYPE_RESERVED1 = CLASS_STRUCT or 0b0111
const val TYPE_RESERVED = CLASS_STRUCT or 0b0111
// with variant
const val TYPE_FEATURE = CLASS_STRUCT or 0b0100
const val TYPE_XYZ = CLASS_STRUCT or 0b0101
const val TYPE_CUSTOM = CLASS_STRUCT or 0b0110
const val TYPE_RESERVED2 = CLASS_STRUCT or 0b0111
const val TYPE_RESERVED1 = CLASS_STRUCT or 0b0101
const val TYPE_RESERVED2 = CLASS_STRUCT or 0b0110
const val TYPE_CUSTOM = CLASS_STRUCT or 0b0111

const val FEATURE_VARIANT_UNKNOWN = 0
const val FEATURE_VARIANT_GEO_JSON = 1
const val FEATURE_VARIANT_TAGS = 2

/**
* A special type returned when the offset in a reader is invalid or for any other error.