-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unable to write double/f64 on float columns using the Storage Write API #106
Comments
I didn’t implement this feature directly; it was a contribution. @imor, do you have any idea why the f64 isn’t working? |
@fungiboletus let me take a look. |
Hey @fungiboletus, I've started a fix in the draft PR #112. One thing which I found a bit weird is that the f32 types appear to have some precision loss in BigQuery. E.g. when I inserted an f32 with value 1.2, it appeared as 1.2000000476837158. I've observed the exact same behaviour with the official python library as well, so not sure if it's a bug in our implementation or something wrong with BigQuery. Can you test a bit with the branch in the PR and share your results. I'm also looking if we need to add support for other types as well and will include any other missing types. |
@imor this isn't loss of precision in the traditional sense, but rather just that 1.2 can't be exactly represented by IEEE 754 64 bit floats. But it's not BigQuery specific, just casting an f32 to an f64 in pure rust does the same: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=469b96b4a3165289bef2bfd75a91a55d And given that the BigQuery only has a |
@fore5fire thanks for the comment. Makes sense. I suspected this but then was confused by the fact that when I was inserting into BigQuery using SQL, 1.2 was shown in exact representation. I think that could be due to SQL converting the literal value 1.2 directly into |
I spent some time on this today. The immediate cause of the reported issue was that the Prost struct serialized the
As in the current draft PR, it can be corrected by mapping a This analysis also exposed another weakness of the current design: specifying the same information twice. Once as prost annotations on the struct and the second time as a # define protobuf schema once in the class
class Actor(proto.Message):
actor_id = proto.Field(proto.INT64, number=1)
float = proto.Field(proto.FLOAT, number=2)
double = proto.Field(proto.DOUBLE, number=3)
proto_descriptor = DescriptorProto()
# then copy it to `proto_descriptor`
Actor.pb().DESCRIPTOR.CopyToProto(proto_descriptor) Doing something similar in Rust means updating the prost's Edit: the above is unfortunately a breaking change, but that is the correct way to fix the issue. The precision loss issue is also confirmed to be due to conversion from an |
Thanks a lot for your work on this issue. I remember being slightly confused about the various mappings so I simply copy pasted the examples and tried all permutations to make f64 work, hoping for the best 😊 I think that specifying the information twice is an acceptable tradeoff. Do you still want me to test the PR or should I wait a bit for the breaking change to happen? |
@fungiboletus no testing needed for now, I've marked the PR ready to merge. You can use PR's branch if you want the code with the fix. Note that it has breaking changes with older code so minor adjustments are expected in your code. |
@imor Is there a specific reason besides the additional work required in adding the nested descriptors that you're not adding Enum and Message types? I'm working on a PR which adds support for these, and it would be helpful to know of any specific blockers you had in mind |
I thought there were no corresponding BigQuery types for these. I might be mistaken. I guess Enums can be mapped to integers (or strings) maybe? I'm not certain. If you have ideas about how to add support for this I'd love a PR for this work. |
Good to hear there's not some big blocker I wasn't aware of. Per the docs, enums map to strings and messages/structs map to records. I'll incorporate your type updates and open a PR once I test it a bit in my project |
A correction regarding this statement I made. An fn main() {
let f: f32 = 1.2;
println!("{:.32}", f);
println!("{:.32}", f as f64);
} The above print 32 digits after the decimal point resulting in the following output:
|
Hi,
I am testing the new Storage Write API and I encounter some issues with
FLOAT/FLOAT64
columns. As I understood, BigQuery only has 64 bits floats columns, but I only manage to write 32 bits floats from Rust on those columns.On version
0.22.0
,double/f64
fails silently, withNULL
values being inserted instead.My table schema looks like this:
My field descriptors like this:
My Prost struct like this:
the String, f32, and numeric values work well though.
Have you experienced the same?
The text was updated successfully, but these errors were encountered: