CSHARP-5202: BSON Binary Vector Subtype Support
Showing 25 changed files with 2,187 additions and 129 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,58 @@ | ||
# Testing Binary subtype 9: Vector | ||
|
||
The JSON files in this directory tree are platform-independent tests that drivers can use to prove their conformance to | ||
the specification. | ||
|
||
These tests focus on the roundtrip of the list of numbers as input/output, along with their data type and byte padding. | ||
|
||
Additional tests exist in `bson_corpus/tests/binary.json` but do not sufficiently test the end-to-end process of Vector | ||
to BSON. For this reason, drivers must create a bespoke test runner for the vector subtype. | ||
|
||
## Format | ||
|
||
The test data corpus consists of a JSON file for each data type (dtype). Each file contains a number of test cases, | ||
under the top-level key "tests". Each test case pertains to a single vector. The keys provide the specification of the | ||
vector. Valid cases also include the Canonical BSON format of a document {test_key: binary}. The "test_key" is common, | ||
and specified at the top level. | ||
|
||
#### Top level keys | ||
|
||
Each JSON file contains three top-level keys. | ||
|
||
- `description`: human-readable description of what is in the file | ||
- `test_key`: name used for key when encoding/decoding a BSON document containing the single BSON Binary for the test | ||
case. Applies to *every* case. | ||
- `tests`: array of test case objects, each of which has the following keys. Valid cases will also contain additional | ||
binary and JSON encoding values. | ||
|
||
#### Keys of individual test cases | ||
|
||
- `description`: string describing the test. | ||
- `valid`: boolean indicating if the vector, dtype, and padding should be considered a valid input. | ||
- `vector`: list of numbers | ||
- `dtype_hex`: string defining the data type in hex (e.g. "0x10", "0x27") | ||
- `dtype_alias`: (optional) string defining the data type, perhaps as an Enum. | ||
- `padding`: (optional) integer for byte padding. Defaults to 0. | ||
- `canonical_bson`: (required if valid is true) an (uppercase) big-endian hex representation of a BSON byte string. | ||
|
||
## Required tests | ||
|
||
#### To prove correct in a valid case (`valid: true`), one MUST | ||
|
||
- encode a document from the numeric values, dtype, and padding, along with the "test_key", and assert this matches the | ||
canonical_bson string. | ||
- decode the canonical_bson into its binary form, and then assert that the numeric values, dtype, and padding all match | ||
those provided in the JSON. | ||
|
||
Note: For floating-point number types, exact numerical matches may not be possible. Drivers that natively support the | ||
floating-point type being tested (e.g., when testing float32 vector values in a driver that natively supports float32) | ||
MUST assert that the input float array is the same after encoding and decoding. | ||
|
||
#### To prove correct in an invalid case (`valid: false`), one MUST | ||
|
||
- raise an exception when attempting to encode a document from the numeric values, dtype, and padding. | ||
|
||
## FAQ | ||
|
||
- What MongoDB Server version does this apply to? | ||
- Files in the "specifications" repository have no version scheme. They are not tied to a MongoDB server version. |
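
The two required checks for a valid case in the README above can be sketched in a few lines of Python. This is a minimal illustration, not the driver's implementation: the function names `encode_vector_document` and `decode_vector_document` are hypothetical, the byte layout is inferred from the `canonical_bson` strings in the test files, and no input validation is performed here.

```python
import struct

# struct format characters per dtype, inferred from the test corpus
_DTYPE_FORMATS = {0x27: "f", 0x03: "b", 0x10: "B"}  # FLOAT32, INT8, PACKED_BIT


def encode_vector_document(test_key, vector, dtype, padding=0):
    """Build the BSON bytes of {test_key: Binary(subtype 9)} by hand."""
    fmt = _DTYPE_FORMATS[dtype]
    data = struct.pack(f"<{len(vector)}{fmt}", *vector)  # little-endian elements
    payload = bytes([dtype, padding]) + data             # dtype byte, padding byte, data
    element = (
        b"\x05"                                # BSON element type: binary
        + test_key.encode("utf-8") + b"\x00"   # key as a cstring
        + struct.pack("<i", len(payload))      # binary length
        + b"\x09"                              # binary subtype 9 (Vector)
        + payload
    )
    body = element + b"\x00"                   # document terminator
    return struct.pack("<i", len(body) + 4) + body


def decode_vector_document(raw, test_key):
    """Return (vector, dtype, padding) from a single-element BSON document."""
    key = test_key.encode("utf-8") + b"\x00"
    (total,) = struct.unpack_from("<i", raw, 0)
    if total != len(raw) or raw[4] != 0x05 or raw[5:5 + len(key)] != key:
        raise ValueError("unexpected document layout")
    offset = 5 + len(key)
    (size,) = struct.unpack_from("<i", raw, offset)
    if raw[offset + 4] != 0x09:
        raise ValueError("not a binary subtype 9 value")
    payload = raw[offset + 5:offset + 5 + size]
    dtype, padding, data = payload[0], payload[1], payload[2:]
    fmt = _DTYPE_FORMATS[dtype]
    count = len(data) // struct.calcsize("<" + fmt)
    return list(struct.unpack(f"<{count}{fmt}", data)), dtype, padding
```

A test runner would call `encode_vector_document` with the case's `vector`, `dtype_hex` (parsed as an integer), and `padding`, compare the uppercase hex of the result with `canonical_bson`, and then decode `canonical_bson` and compare the three returned values with the JSON.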
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
{ | ||
"description": "Tests of Binary subtype 9, Vectors, with dtype FLOAT32", | ||
"test_key": "vector", | ||
"tests": [ | ||
{ | ||
"description": "Simple Vector FLOAT32", | ||
"valid": true, | ||
"vector": [127.0, 7.0], | ||
"dtype_hex": "0x27", | ||
"dtype_alias": "FLOAT32", | ||
"padding": 0, | ||
"canonical_bson": "1C00000005766563746F72000A0000000927000000FE420000E04000" | ||
}, | ||
{ | ||
"description": "Vector with decimals and negative value FLOAT32", | ||
"valid": true, | ||
"vector": [127.7, -7.7], | ||
"dtype_hex": "0x27", | ||
"dtype_alias": "FLOAT32", | ||
"padding": 0, | ||
"canonical_bson": "1C00000005766563746F72000A0000000927006666FF426666F6C000" | ||
}, | ||
{ | ||
"description": "Empty Vector FLOAT32", | ||
"valid": true, | ||
"vector": [], | ||
"dtype_hex": "0x27", | ||
"dtype_alias": "FLOAT32", | ||
"padding": 0, | ||
"canonical_bson": "1400000005766563746F72000200000009270000" | ||
}, | ||
{ | ||
"description": "Infinity Vector FLOAT32", | ||
"valid": true, | ||
"vector": ["-inf", 0.0, "inf"], | ||
"dtype_hex": "0x27", | ||
"dtype_alias": "FLOAT32", | ||
"padding": 0, | ||
"canonical_bson": "2000000005766563746F72000E000000092700000080FF000000000000807F00" | ||
}, | ||
{ | ||
"description": "FLOAT32 with padding", | ||
"valid": false, | ||
"vector": [127.0, 7.0], | ||
"dtype_hex": "0x27", | ||
"dtype_alias": "FLOAT32", | ||
"padding": 3 | ||
} | ||
] | ||
} | ||
|
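
Two details of the FLOAT32 cases above are easy to miss: infinities appear in the JSON as the strings `"inf"` and `"-inf"`, and values such as 127.7 are not exactly representable in 32 bits. The sketch below shows one way a runner might handle both; the helper names are hypothetical and the approach is an assumption, not something the test files prescribe.

```python
import struct


def parse_float32_vector(raw_values):
    """Convert JSON values to Python floats; float() accepts 'inf' and '-inf'."""
    return [float(value) for value in raw_values]


def as_float32(value):
    """Round a Python float (64-bit) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def pack_float32(values):
    """Pack values as consecutive little-endian IEEE 754 singles."""
    return struct.pack(f"<{len(values)}f", *values)
```

In a language whose native float is 64-bit, comparing a decoded vector against the JSON literals directly would fail for the second case, so a runner could compare against `as_float32(x)` for each input `x` instead.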
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,57 @@ | ||
{ | ||
"description": "Tests of Binary subtype 9, Vectors, with dtype INT8", | ||
"test_key": "vector", | ||
"tests": [ | ||
{ | ||
"description": "Simple Vector INT8", | ||
"valid": true, | ||
"vector": [127, 7], | ||
"dtype_hex": "0x03", | ||
"dtype_alias": "INT8", | ||
"padding": 0, | ||
"canonical_bson": "1600000005766563746F7200040000000903007F0700" | ||
}, | ||
{ | ||
"description": "Empty Vector INT8", | ||
"valid": true, | ||
"vector": [], | ||
"dtype_hex": "0x03", | ||
"dtype_alias": "INT8", | ||
"padding": 0, | ||
"canonical_bson": "1400000005766563746F72000200000009030000" | ||
}, | ||
{ | ||
"description": "Overflow Vector INT8", | ||
"valid": false, | ||
"vector": [128], | ||
"dtype_hex": "0x03", | ||
"dtype_alias": "INT8", | ||
"padding": 0 | ||
}, | ||
{ | ||
"description": "Underflow Vector INT8", | ||
"valid": false, | ||
"vector": [-129], | ||
"dtype_hex": "0x03", | ||
"dtype_alias": "INT8", | ||
"padding": 0 | ||
}, | ||
{ | ||
"description": "INT8 with padding", | ||
"valid": false, | ||
"vector": [127, 7], | ||
"dtype_hex": "0x03", | ||
"dtype_alias": "INT8", | ||
"padding": 3 | ||
}, | ||
{ | ||
"description": "INT8 with float inputs", | ||
"valid": false, | ||
"vector": [127.77, 7.77], | ||
"dtype_hex": "0x03", | ||
"dtype_alias": "INT8", | ||
"padding": 0 | ||
} | ||
] | ||
} | ||
|
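
The invalid INT8 cases above imply three checks before encoding: padding must be 0, every element must be an integer, and every element must fit in a signed byte. A minimal sketch follows, with hypothetical function names and `ValueError` chosen here as the exception type.

```python
import struct


def check_int8(vector, padding=0):
    """Validate an INT8 vector and return the binary payload (dtype, padding, data)."""
    if padding != 0:
        raise ValueError("INT8 vectors do not allow padding")
    for value in vector:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"INT8 element must be an integer: {value!r}")
        if not -128 <= value <= 127:
            raise ValueError(f"INT8 element out of range: {value!r}")
    return bytes([0x03, 0]) + struct.pack(f"<{len(vector)}b", *vector)


def int8_case_is_rejected(case):
    """Return True if encoding the test case raises ValueError."""
    try:
        check_int8(case["vector"], case.get("padding", 0))
    except ValueError:
        return True
    return False
```

A runner could then assert `int8_case_is_rejected(case) == (not case["valid"])` for every entry in `tests`.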
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,98 @@ | ||
{ | ||
"description": "Tests of Binary subtype 9, Vectors, with dtype PACKED_BIT", | ||
"test_key": "vector", | ||
"tests": [ | ||
{ | ||
"description": "Padding specified with no vector data PACKED_BIT", | ||
"valid": false, | ||
"vector": [], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": 1 | ||
}, | ||
{ | ||
"description": "Simple Vector PACKED_BIT", | ||
"valid": true, | ||
"vector": [127, 7], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": 0, | ||
"canonical_bson": "1600000005766563746F7200040000000910007F0700" | ||
}, | ||
{ | ||
"description": "Empty Vector PACKED_BIT", | ||
"valid": true, | ||
"vector": [], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": 0, | ||
"canonical_bson": "1400000005766563746F72000200000009100000" | ||
}, | ||
{ | ||
"description": "PACKED_BIT with padding", | ||
"valid": true, | ||
"vector": [127, 7], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": 3, | ||
"canonical_bson": "1600000005766563746F7200040000000910037F0700" | ||
}, | ||
{ | ||
"description": "Overflow Vector PACKED_BIT", | ||
"valid": false, | ||
"vector": [256], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": 0 | ||
}, | ||
{ | ||
"description": "Underflow Vector PACKED_BIT", | ||
"valid": false, | ||
"vector": [-1], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": 0 | ||
}, | ||
{ | ||
"description": "Vector with float values PACKED_BIT", | ||
"valid": false, | ||
"vector": [127.5], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": 0 | ||
}, | ||
{ | ||
"description": "Padding specified with no vector data PACKED_BIT", | ||
"valid": false, | ||
"vector": [], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": 1 | ||
}, | ||
{ | ||
"description": "Exceeding maximum padding PACKED_BIT", | ||
"valid": false, | ||
"vector": [1], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": 8 | ||
}, | ||
{ | ||
"description": "Negative padding PACKED_BIT", | ||
"valid": false, | ||
"vector": [1], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": -1 | ||
}, | ||
{ | ||
"description": "Vector with float values PACKED_BIT", | ||
"valid": false, | ||
"vector": [127.5], | ||
"dtype_hex": "0x10", | ||
"dtype_alias": "PACKED_BIT", | ||
"padding": 0 | ||
} | ||
] | ||
} | ||
|
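
The PACKED_BIT cases above constrain both the elements and the padding. The sketch below encodes the rules as read from this file: elements are integers in 0..255, padding is an integer in 0..7, and non-zero padding requires at least one byte of data. Treating padding as the number of bits ignored in the final byte is an assumption inferred from the "Exceeding maximum padding" case; the function names are hypothetical.

```python
def check_packed_bit(vector, padding=0):
    """Validate a PACKED_BIT vector and return the binary payload (dtype, padding, data)."""
    if isinstance(padding, bool) or not isinstance(padding, int):
        raise ValueError("padding must be an integer")
    if not 0 <= padding <= 7:
        raise ValueError(f"padding must be between 0 and 7: {padding}")
    if padding and not vector:
        raise ValueError("padding requires at least one byte of data")
    for value in vector:
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, int):
            raise ValueError(f"PACKED_BIT element must be an integer: {value!r}")
        if not 0 <= value <= 255:
            raise ValueError(f"PACKED_BIT element out of range: {value!r}")
    return bytes([0x10, padding]) + bytes(vector)


def packed_bit_case_is_rejected(case):
    """Return True if encoding the test case raises ValueError."""
    try:
        check_packed_bit(case["vector"], case.get("padding", 0))
    except ValueError:
        return True
    return False
```

Under the stated assumption, a payload with `n` data bytes and padding `p` carries `8 * n - p` meaningful bits, which is why a padding of 8 can never be valid.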