Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for BSON element 11: uint64 #4380

Open
wants to merge 15 commits into
base: develop
Choose a base branch
from

Conversation

slowriot
Copy link

@slowriot slowriot commented May 24, 2024

This PR adds support for decoding and encoding BSON element 11: 64bit unsigned int.

This resolves #3894.

Changes

  • Decode element 0x11 correctly as uint64_t.
  • Update existing write_bson_unsigned to encode unsigned 64bit values correctly (using 0x11 instead of signed 0x12).
  • Update existing tests to expect support for uint64_t.

Reference

Pull request checklist

  • Changes are described in the pull request, or an existing issue is referenced.
  • The test suite compiles and runs without error.
  • Code coverage is 100%. Test cases can be added by editing the test suite.
  • The source code is amalgamated; that is, after making changes to the sources in the include/nlohmann directory, run make amalgamate to create the single-header files single_include/nlohmann/json.hpp and single_include/nlohmann/json_fwd.hpp. The whole process is described here.

@coveralls
Copy link

Coverage Status

coverage: 99.984% (-0.02%) from 100.0%
when pulling 7c616a2 on slowriot:bson-element-11-uint64
into 8c391e0 on nlohmann:develop.

Copy link
Owner

@nlohmann nlohmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patience. The code looks good, but the documentation needs adjustment:

@slowriot
Copy link
Author

slowriot commented Nov 7, 2024

Thanks for the patience. The code looks good, but the documentation needs adjustment:

Thanks for the review. Documentation updated in latest. Please let me know if I've missed anything else.

@coveralls
Copy link

coveralls commented Nov 7, 2024

Coverage Status

coverage: 99.624% (-0.02%) from 99.639%
when pulling 36ce984 on slowriot:bson-element-11-uint64
into 2d42229 on nlohmann:develop.

@nlohmann
Copy link
Owner

nlohmann commented Nov 7, 2024

Coverage went down. Can you please check if a test is missing?

}
else
{
JSON_THROW(out_of_range::create(407, concat("integer number ", std::to_string(j.m_data.m_value.number_unsigned), " cannot be represented by BSON as it does not fit int64"), &j));
JSON_THROW(out_of_range::create(407, concat("unsigned integer number ", std::to_string(j.m_data.m_value.number_unsigned), " cannot be represented by BSON as it does not fit into uint64"), &j));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read the coverage information correctly, then this line is not covered in a test. Please check and add a test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can only happen if number_unsigned has a larger size than 64 bits. Does the library support that? If not, then this can't be hit.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, I failed to compile the library with 128-bit integers. For CBOR, we assume all unsigned integers to fit into 64 bits, so I think it's fair to do the same here. (Any code like the one above could not be tested anyway.)

@nlohmann
Copy link
Owner

(And also, please update to the latest develop branch so that the CI can be green)

@nlohmann
Copy link
Owner

Btw: have you looked at #4356?

@nlohmann nlohmann added the aspect: binary formats BSON, CBOR, MessagePack, UBJSON label Nov 20, 2024
@slowriot slowriot force-pushed the bson-element-11-uint64 branch from d864d43 to 5d0b29b Compare January 10, 2025 18:33
@slowriot slowriot force-pushed the bson-element-11-uint64 branch from 5d0b29b to d864d43 Compare January 10, 2025 18:38
@nlohmann

This comment was marked as outdated.

# Conflicts:
#	docs/mkdocs/docs/features/binary_formats/bson.md
#	docs/mkdocs/docs/home/exceptions.md
#	include/nlohmann/detail/output/binary_writer.hpp
#	single_include/nlohmann/json.hpp
#	tests/src/unit-bson.cpp
@slowriot slowriot force-pushed the bson-element-11-uint64 branch from d864d43 to df79b3f Compare January 10, 2025 18:43
Copy link

🔴 Amalgamation check failed! 🔴

The source code has not been amalgamated. @slowriot
Please read and follow the Contribution Guidelines.

1 similar comment
Copy link

🔴 Amalgamation check failed! 🔴

The source code has not been amalgamated. @slowriot
Please read and follow the Contribution Guidelines.

@slowriot
Copy link
Author

We recently merged #4590. I think this PR can be closed.

Thanks, just catching up with the changes since I last pushed to this PR - took a moment to resolve the conflicts as #4590 duplicated most of what I did, but it seems there may still have been a few changes in this PR that were missed in the other (unit test for example, and slightly different approach to choosing a format when encoding, documentation grammar, case ordering and a few other minor details).

Please review and let me know if any of it is still of use, and if it can be merged on top or if anything needs to be corrected or removed from this change. Otherwise of course it can be closed.

@nlohmann
Copy link
Owner

Sure, go ahead.

@slowriot
Copy link
Author

@nlohmann Having rebased, I think the remaining changes in this PR's diffs may still be useful to merge - please have a look. Otherwise, please close.

@nlohmann
Copy link
Owner

@codenut Can you please have a look at the remaining diff?

Copy link
Owner

@nlohmann nlohmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The remaining changes seem reasonable, but I would also like to get @codenut 's take on this.

  • Please revert the whitespace changes in the documentation. I think there is only one grammar fix left then.
  • There is an uncovered branch that can be removed as it would only be triggered with 128-bit integers.
  • A call to max needs to be put in parentheses to fix the Windows builds.

@@ -1076,7 +1076,7 @@ class binary_writer
{
return (value <= static_cast<std::uint64_t>((std::numeric_limits<std::int32_t>::max)()))
? sizeof(std::int32_t)
: sizeof(std::int64_t);
: sizeof(std::uint64_t);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems reasonable.

}
else
{
JSON_THROW(out_of_range::create(407, concat("integer number ", std::to_string(j.m_data.m_value.number_unsigned), " cannot be represented by BSON as it does not fit int64"), &j));
JSON_THROW(out_of_range::create(407, concat("unsigned integer number ", std::to_string(j.m_data.m_value.number_unsigned), " cannot be represented by BSON as it does not fit into uint64"), &j));
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, I failed to compile the library with 128-bit integers. For CBOR, we assume all unsigned integers to fit into 64 bits, so I think it's fair to do the same here. (Any code like the one above could not be tested anyway.)

{
write_bson_entry_header(name, 0x12 /* int64 */);
write_number<std::int64_t>(static_cast<std::int64_t>(j.m_data.m_value.number_unsigned), true);
write_bson_entry_header(name, 0x11 /* uint64 */);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems reasonable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is logical, but I have some concerns about backward compatibility. It may break the following scenario:

void write()
{
    const uint64_t = 9223372036854775807L;
    json const j = {
            {"entry", l}
    };
    const std::vector<uint8_t> bson = json::to_bson(j);
    saveToTable1(bson);
    saveToTable2(bson);
}

write(); // was called before the code changes in this PR
void read()
{
    const std::vector<uint8_t> bson1 = loadFromTable1();
    json const j = json::from_bson(bson);
    const std::vector<uint8_t> bson1_roundtrip = json::to_bson(j);

    const std::vector<uint8_t> bson2 = loadFromTable2();

    if (equals(bson1_roundtrip, bson2)) { 
        ...
    }
}

read(); 

With the changes in this PR, the comparison between bson1_roundtrip and bson2 will fail. It is hard to tell if clients rely on this behavior, but I would like to highlight this potential issue.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For compatibility, we could serialize unsigned integers until int64_max with 0x12 and all numbers larger with 0x11.

@@ -1132,7 +1132,7 @@ TEST_CASE("BSON numerical data")
std::vector<std::uint8_t> const expected_bson =
{
0x14u, 0x00u, 0x00u, 0x00u, // size (little endian)
0x12u, /// entry: int64
0x11u, /// entry: uint64
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a consequence of the change in write_bson_unsigned above.

include/nlohmann/detail/output/binary_writer.hpp Outdated Show resolved Hide resolved
@slowriot
Copy link
Author

The remaining changes seem reasonable, but I would also like to get @codenut 's take on this.

* Please revert the whitespace changes in the documentation. I think there is only one grammar fix left then.

* There is an uncovered branch that can be removed as it would only be triggered with 128-bit integers.

* A call to `max` needs to be put in parentheses to fix the Windows builds.

Thanks for reviewing. I've merged in the latest develop branch which already includes equivalent changes for a lot of the original PR, and tidied up the remaining diffs. Please let me know if there's anything outstanding to address.

@nlohmann
Copy link
Owner

Please check https://github.com/nlohmann/json/pull/4380/checks?check_run_id=35473431508on how to fix the DCO warning.

@nlohmann
Copy link
Owner

(The coverage job is broken also in other branches. I am looking into this)

@nlohmann
Copy link
Owner

The coverage job is now fixed in the develop branch. Please update.

@nlohmann nlohmann added the please rebase Please rebase your branch to origin/develop label Jan 15, 2025
{
write_bson_entry_header(name, 0x12 /* int64 */);
write_number<std::int64_t>(static_cast<std::int64_t>(j.m_data.m_value.number_unsigned), true);
write_bson_entry_header(name, 0x11 /* uint64 */);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For compatibility, we could serialize unsigned integers until int64_max with 0x12 and all numbers larger with 0x11.

@nlohmann
Copy link
Owner

Please update to the latest develop branch and check https://github.com/nlohmann/json/pull/4380/checks?check_run_id=35473431508 on how to fix the DCO warning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
aspect: binary formats BSON, CBOR, MessagePack, UBJSON documentation M please rebase Please rebase your branch to origin/develop tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

json::to_bson does not handle unsigned 64 bit integers
5 participants