Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(stdlib): parse_json improved accuracy float parsing #755 #767

Merged
merged 2 commits into from
Apr 1, 2024

Conversation

drmason13
Copy link
Contributor

Fixed by activating a feature in serde_json, see serde-rs/json#536

For anyone somehow relying on the innaccurate/faster parsing of floats, it's technically a breaking change.

Copy link
Contributor

@pront pront left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @drmason13. I wonder if a better approach is to add a flag and preserve backwards compatibility. Also the accurate parser is much slower so we might not want to turn it on unconditionally.

@pront
Copy link
Contributor

pront commented Mar 26, 2024

Also cc @jszwedko

@jszwedko
Copy link
Member

Thanks @drmason13 ! A concern I have is that this apparently comes at 2x overhead cost:

https://github.com/serde-rs/json/blob/b1ebf3888ed728c66c69581ba8d9c4aa7483f486/Cargo.toml#L60-L67

It would be nice to make this an optional feature that could be enabled at runtime but given it is a compiler flag I'm not seeing a way to do that. I'd be curious to see if this shows up in our benchmarks in Vector. I opened a PR to see if it flags any difference:

vectordotdev/vector#20185

@drmason13
Copy link
Contributor Author

The performance implications are a valid concern, interested to see what the benchmarks show.

I think it would get complicated to have this be controlled by a flag. Would you need two versions of the serde_json dependency - one with the feature and one without? I'm not sure how to do this.

@pront
Copy link
Contributor

pront commented Mar 27, 2024

The results here are better than I expected. I am OK with making accurate parsing the default. However, I don't have much time to think about this more deeply. Ultimately I will leave this decision to @jszwedko.

@jszwedko
Copy link
Member

Agreed, the impact on Vector seems fairly negligible, I think likely because I/O dominates the tests rather than the extra CPU spent on encoding/decoding floats.

I would like to see this be an optional feature in the VRL crate, though, that could be enabled by Vector so that other consumers of the VRL crate can decide whether to opt into it. Do you mind refactoring this PR to do that? See https://doc.rust-lang.org/cargo/reference/features.html#dependency-features for what this looks like.

@drmason13
Copy link
Contributor Author

Open to suggestions for the feature name, I've gone with mirroring the serde_json feature name for clarity.

Copy link
Member

@jszwedko jszwedko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @drmason13 ! This seems reasonable to me.

@jszwedko jszwedko requested a review from pront March 29, 2024 13:28
@pront pront added this pull request to the merge queue Apr 1, 2024
Merged via the queue into vectordotdev:main with commit dbb86dc Apr 1, 2024
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants