Add support for BSON element 11: uint64 #4380

Open
wants to merge 15 commits into base: develop
8 changes: 4 additions & 4 deletions docs/mkdocs/docs/features/binary_formats/bson.md
@@ -8,7 +8,7 @@ representation of data types that are not part of the JSON spec. For example, BS

- [BSON Website](http://bsonspec.org) - the main source on BSON
- [BSON Specification](http://bsonspec.org/spec.html) - the specification


## Serialization

@@ -43,7 +43,7 @@ The library uses the following mapping from JSON values types to BSON types:
```cpp
--8<-- "examples/to_bson.cpp"
```

Output:

```c
@@ -97,5 +97,5 @@ The library maps BSON record types to JSON value types as follows:

!!! note "Handling of BSON type 0x11"

BSON type 0x11 is used to represent uint64 numbers. This library treats these values purely as uint64 numbers
and does not parse them into date-related formats.
BSON type 0x11 is used to represent uint64 numbers. This library treats these values purely as uint64 numbers
and does not parse them into date-related formats.
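
A minimal sketch of what "purely as uint64" means for the payload, assuming the 8-byte little-endian layout shown in the unit tests of this PR. `read_le_uint64` is a hypothetical helper for illustration and is not part of the library:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: interprets the 8 bytes starting at `offset` as one
// little-endian unsigned 64-bit number, without splitting them into any
// date-related fields.
inline std::uint64_t read_le_uint64(const std::vector<std::uint8_t>& bytes, std::size_t offset)
{
    assert(offset + 8 <= bytes.size()); // the payload must be complete
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < 8; ++i)
    {
        // byte i contributes bits [8*i, 8*i + 7]
        value |= static_cast<std::uint64_t>(bytes[offset + i]) << (8u * i);
    }
    return value;
}
```

For the document used in `tests/src/unit-bson.cpp`, the payload starts at offset 11 (4 size bytes, 1 type byte, 6 bytes for the key `entry` including its terminator).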
24 changes: 12 additions & 12 deletions docs/mkdocs/docs/home/exceptions.md
@@ -47,15 +47,15 @@ Note that [`JSON_THROW_USER`](../api/macros/json_throw_user.md) should leave the

```cpp
#include <iostream>

#define JSON_TRY_USER if(true)
#define JSON_CATCH_USER(exception) if(false)
#define JSON_THROW_USER(exception) \
{std::clog << "Error in " << __FILE__ << ":" << __LINE__ \
<< " (function " << __FUNCTION__ << ") - " \
<< (exception).what() << std::endl; \
std::abort();}

#include <nlohmann/json.hpp>
```

@@ -72,7 +72,7 @@ Exceptions in the library are thrown in the local context of the JSON value they
```cpp
--8<-- "examples/diagnostics_standard.cpp"
```

Output:

```
@@ -90,7 +90,7 @@ As this global context comes at the price of storing one additional pointer per
```cpp
--8<-- "examples/diagnostics_extended.cpp"
```

Output:

```
@@ -125,7 +125,7 @@ Exceptions have ids 1xx.
```cpp
--8<-- "examples/parse_error.cpp"
```

Output:

```
@@ -377,7 +377,7 @@ Exceptions have ids 2xx.
```cpp
--8<-- "examples/invalid_iterator.cpp"
```

Output:

```
@@ -541,7 +541,7 @@ Exceptions have ids 3xx.
```cpp
--8<-- "examples/type_error.cpp"
```

Output:

```
@@ -735,7 +735,7 @@ The `dump()` function only works with UTF-8 encoded strings; that is, if you ass

- Store the source file with UTF-8 encoding.
- Pass an error handler as last parameter to the `dump()` function to avoid this exception:
- `json::error_handler_t::replace` will replace invalid bytes sequences with `U+FFFD`
- `json::error_handler_t::replace` will replace invalid bytes sequences with `U+FFFD`
- `json::error_handler_t::ignore` will silently ignore invalid byte sequences

### json.exception.type_error.317
@@ -770,7 +770,7 @@ Exceptions have ids 4xx.
```cpp
--8<-- "examples/out_of_range.cpp"
```

Output:

```
@@ -839,7 +839,7 @@ A parsed number could not be stored as without changing it to NaN or INF.

### json.exception.out_of_range.407

UBJSON only support integer numbers up to 9223372036854775807.
UBJSON only supports integer numbers up to 9223372036854775807.

!!! failure "Example message"

@@ -849,7 +849,7 @@ UBJSON only support integer numbers up to 9223372036854775807.

!!! note

Since version 3.9.0, integer numbers beyond int64 are serialized as high-precision UBJSON numbers, and this exception does not further occur.
Since version 3.9.0, integer numbers beyond int64 are serialized as high-precision UBJSON numbers, and this exception does not further occur.

### json.exception.out_of_range.408

@@ -885,7 +885,7 @@ Exceptions have ids 5xx.
```cpp
--8<-- "examples/other_error.cpp"
```

Output:

```
11 changes: 6 additions & 5 deletions include/nlohmann/detail/input/binary_reader.hpp
@@ -322,17 +322,18 @@ class binary_reader
return get_number<std::int32_t, true>(input_format_t::bson, value) && sax->number_integer(value);
}

case 0x11: // uint64
{
std::uint64_t value{};
return get_number<std::uint64_t, true>(input_format_t::bson, value) && sax->number_unsigned(value);
}

case 0x12: // int64
{
std::int64_t value{};
return get_number<std::int64_t, true>(input_format_t::bson, value) && sax->number_integer(value);
}

case 0x11: // uint64
{
std::uint64_t value{};
return get_number<std::uint64_t, true>(input_format_t::bson, value) && sax->number_unsigned(value);
}

default: // anything else not supported (yet)
{
11 changes: 5 additions & 6 deletions include/nlohmann/detail/output/binary_writer.hpp
@@ -1076,7 +1076,7 @@ class binary_writer
{
return (value <= static_cast<std::uint64_t>((std::numeric_limits<std::int32_t>::max)()))
? sizeof(std::int32_t)
: sizeof(std::int64_t);
: sizeof(std::uint64_t);
Owner

This change seems reasonable.

}

/*!
@@ -1090,15 +1090,14 @@ class binary_writer
write_bson_entry_header(name, 0x10 /* int32 */);
write_number<std::int32_t>(static_cast<std::int32_t>(j.m_data.m_value.number_unsigned), true);
}
else if (j.m_data.m_value.number_unsigned <= static_cast<std::uint64_t>((std::numeric_limits<std::int64_t>::max)()))
else if (j.m_data.m_value.number_unsigned <= std::numeric_limits<std::uint64_t>::max())
slowriot marked this conversation as resolved.
{
write_bson_entry_header(name, 0x12 /* int64 */);
write_number<std::int64_t>(static_cast<std::int64_t>(j.m_data.m_value.number_unsigned), true);
write_bson_entry_header(name, 0x11 /* uint64 */);
Owner

This change seems reasonable.

Contributor

This change is logical, but I have some concerns about backward compatibility. It may break the following scenario:

void write()
{
    const uint64_t l = 9223372036854775807L;
    json const j = {
            {"entry", l}
    };
    const std::vector<uint8_t> bson = json::to_bson(j);
    saveToTable1(bson);
    saveToTable2(bson);
}

write(); // was called before the code changes in this PR
void read()
{
    const std::vector<uint8_t> bson1 = loadFromTable1();
    json const j = json::from_bson(bson1);
    const std::vector<uint8_t> bson1_roundtrip = json::to_bson(j);

    const std::vector<uint8_t> bson2 = loadFromTable2();

    if (equals(bson1_roundtrip, bson2)) { 
        ...
    }
}

read(); 

With the changes in this PR, the comparison between bson1_roundtrip and bson2 will fail. It is hard to tell if clients rely on this behavior, but I would like to highlight this potential issue.
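
A minimal sketch of why byte-wise comparisons are sensitive to the type code, assuming the single-entry layout used in the unit tests (4-byte size, type byte, NUL-terminated key, 8-byte little-endian payload, end marker). `encode_single_entry` is a hypothetical hand-rolled helper, not library code; it only shows that the same value written under 0x12 and under 0x11 yields documents that differ in exactly one byte:

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encodes a BSON document holding one 64-bit entry.
// The caller chooses the element type byte (0x12 for int64, 0x11 for uint64).
inline std::vector<std::uint8_t> encode_single_entry(const std::string& name,
                                                     std::uint8_t type,
                                                     std::uint64_t value)
{
    assert(name.find('\0') == std::string::npos); // keys are NUL-terminated

    // size(4) + type(1) + key + NUL(1) + payload(8) + end marker(1)
    const std::uint32_t size = static_cast<std::uint32_t>(4 + 1 + name.size() + 1 + 8 + 1);

    std::vector<std::uint8_t> out;
    for (unsigned i = 0; i < 4; ++i)
    {
        out.push_back(static_cast<std::uint8_t>((size >> (8u * i)) & 0xffu));
    }
    out.push_back(type);
    out.insert(out.end(), name.begin(), name.end());
    out.push_back(0x00); // key terminator
    for (unsigned i = 0; i < 8; ++i)
    {
        out.push_back(static_cast<std::uint8_t>((value >> (8u * i)) & 0xffu));
    }
    out.push_back(0x00); // end marker
    return out;
}
```

Calling it with the key `entry` and the value 9223372036854775807 once with 0x12 and once with 0x11 gives two 20-byte documents whose only difference is the type byte at index 4, which is enough to make a stored-bytes comparison fail.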

Owner

For compatibility, we could serialize unsigned integers up to int64_max with 0x12 and all larger numbers with 0x11.
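
The suggestion could be sketched as follows. `bson_type_for_unsigned` and `check_examples` are hypothetical names used only for illustration; the actual change would live in `write_bson_unsigned` and `calc_bson_unsigned_size`, and this sketch only shows the selection of the type code:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: picks the BSON element type for an unsigned value
// so that values representable as int64 keep their existing encoding.
inline std::uint8_t bson_type_for_unsigned(std::uint64_t value)
{
    if (value <= static_cast<std::uint64_t>((std::numeric_limits<std::int32_t>::max)()))
    {
        return 0x10; // int32
    }
    if (value <= static_cast<std::uint64_t>((std::numeric_limits<std::int64_t>::max)()))
    {
        return 0x12; // int64: same type code as before
    }
    return 0x11; // uint64: only for values beyond int64_max
}

// Usage example at the boundaries between the three ranges.
inline void check_examples()
{
    assert(bson_type_for_unsigned(2147483647ULL) == 0x10);
    assert(bson_type_for_unsigned(2147483648ULL) == 0x12);
    assert(bson_type_for_unsigned(9223372036854775807ULL) == 0x12);
    assert(bson_type_for_unsigned(9223372036854775808ULL) == 0x11);
}
```

With this scheme, only values that do not fit into int64 would be written with the new type code.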

write_number<std::uint64_t>(static_cast<std::uint64_t>(j.m_data.m_value.number_unsigned), true);
}
else
{
write_bson_entry_header(name, 0x11 /* uint64 */);
write_number<std::uint64_t>(static_cast<std::uint64_t>(j.m_data.m_value.number_unsigned), true);
JSON_THROW(out_of_range::create(407, concat("unsigned integer number ", std::to_string(j.m_data.m_value.number_unsigned), " cannot be represented by BSON as it does not fit into uint64"), &j));
Owner

If I read the coverage information correctly, this line is not covered by a test. Please check and add a test.

Contributor

I think this can only happen if number_unsigned has a larger size than 64 bits. Does the library support that? If not, then this can't be hit.

Owner

So far, I failed to compile the library with 128-bit integers. For CBOR, we assume all unsigned integers to fit into 64 bits, so I think it's fair to do the same here. (Any code like the one above could not be tested anyway.)
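
The point about the unreachable branch can be checked in isolation. This is a sketch assuming the unsigned number type is `std::uint64_t` (the library default); `number_unsigned_t` is redeclared locally and `fits_uint64` and `check_unreachable_branch` are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Assumption: the unsigned number type is std::uint64_t, the library default.
using number_unsigned_t = std::uint64_t;

// Mirrors the condition of the `else if` in the diff above.
constexpr bool fits_uint64(number_unsigned_t value)
{
    return value <= (std::numeric_limits<std::uint64_t>::max)();
}

// The condition holds at both ends of the range, so the trailing `else`
// (the JSON_THROW line) cannot be reached for a 64-bit unsigned type.
static_assert(fits_uint64(0), "smallest value fits");
static_assert(fits_uint64((std::numeric_limits<number_unsigned_t>::max)()), "largest value fits");

// Runtime usage example with the largest 64-bit value.
inline void check_unreachable_branch()
{
    assert(fits_uint64(18446744073709551615ULL));
}
```

Under that assumption, a test could only hit the throwing branch with an unsigned type wider than 64 bits.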

}
}

22 changes: 11 additions & 11 deletions single_include/nlohmann/json.hpp
@@ -10000,17 +10000,18 @@ class binary_reader
return get_number<std::int32_t, true>(input_format_t::bson, value) && sax->number_integer(value);
}

case 0x11: // uint64
{
std::uint64_t value{};
return get_number<std::uint64_t, true>(input_format_t::bson, value) && sax->number_unsigned(value);
}

case 0x12: // int64
{
std::int64_t value{};
return get_number<std::int64_t, true>(input_format_t::bson, value) && sax->number_integer(value);
}

case 0x11: // uint64
{
std::uint64_t value{};
return get_number<std::uint64_t, true>(input_format_t::bson, value) && sax->number_unsigned(value);
}

default: // anything else not supported (yet)
{
@@ -16718,7 +16719,7 @@ class binary_writer
{
return (value <= static_cast<std::uint64_t>((std::numeric_limits<std::int32_t>::max)()))
? sizeof(std::int32_t)
: sizeof(std::int64_t);
: sizeof(std::uint64_t);
}

/*!
@@ -16732,15 +16733,14 @@ class binary_writer
write_bson_entry_header(name, 0x10 /* int32 */);
write_number<std::int32_t>(static_cast<std::int32_t>(j.m_data.m_value.number_unsigned), true);
}
else if (j.m_data.m_value.number_unsigned <= static_cast<std::uint64_t>((std::numeric_limits<std::int64_t>::max)()))
else if (j.m_data.m_value.number_unsigned <= std::numeric_limits<std::uint64_t>::max())
{
write_bson_entry_header(name, 0x12 /* int64 */);
write_number<std::int64_t>(static_cast<std::int64_t>(j.m_data.m_value.number_unsigned), true);
write_bson_entry_header(name, 0x11 /* uint64 */);
write_number<std::uint64_t>(static_cast<std::uint64_t>(j.m_data.m_value.number_unsigned), true);
}
else
{
write_bson_entry_header(name, 0x11 /* uint64 */);
write_number<std::uint64_t>(static_cast<std::uint64_t>(j.m_data.m_value.number_unsigned), true);
JSON_THROW(out_of_range::create(407, concat("unsigned integer number ", std::to_string(j.m_data.m_value.number_unsigned), " cannot be represented by BSON as it does not fit into uint64"), &j));
}
}

4 changes: 2 additions & 2 deletions tests/src/unit-bson.cpp
@@ -339,7 +339,7 @@ TEST_CASE("BSON")
std::vector<std::uint8_t> const expected =
{
0x14, 0x00, 0x00, 0x00, // size (little endian)
0x12, /// entry: int64
0x11, /// entry: uint64
'e', 'n', 't', 'r', 'y', '\x00',
0x01, 0x02, 0x03, 0x04, 0x78, 0x56, 0x34, 0x12,
0x00 // end marker
@@ -1132,7 +1132,7 @@ TEST_CASE("BSON numerical data")
std::vector<std::uint8_t> const expected_bson =
{
0x14u, 0x00u, 0x00u, 0x00u, // size (little endian)
0x12u, /// entry: int64
0x11u, /// entry: uint64
Owner

This is a consequence of the change in write_bson_unsigned above.

'e', 'n', 't', 'r', 'y', '\x00',
static_cast<std::uint8_t>((iu >> (8u * 0u)) & 0xffu),
static_cast<std::uint8_t>((iu >> (8u * 1u)) & 0xffu),