Feature/decimal support #982
base: main
Co-authored-by: Steve Suh <[email protected]>
Can we get some traction on this one? It looks like this PR is ready, and we have a lot of code that uses decimals that will be difficult to write tests around without this.
/// </summary>
/// <param name="s">The stream to write</param>
/// <param name="value">The decimal to write</param>
public static void Write(Stream s, decimal value) => Write(s, value.ToString());
Should use ToString(CultureInfo.InvariantCulture) if we are using a string on the wire.
@@ -267,6 +267,9 @@ private ISocketWrapper GetConnection()
    case 'd':
        returnValue = SerDe.ReadDouble(inputStream);
        break;
    case 'm':
        returnValue = decimal.Parse(SerDe.ReadString(inputStream));
Use decimal.Parse(SerDe.ReadString(inputStream), CultureInfo.InvariantCulture) to ensure we are using invariant culture on the wire.
Row row = df.Collect().First();
Assert.Equal(decimal.MinValue, row[0]);
Assert.Equal(decimal.MaxValue, row[1]);
Assert.Equal(decimal.Zero, row[2]);
I haven't had a chance to dig into whether this is an issue yet, but I want to bring it to attention just in case:
There was a time when we were comparing SQL Server output to Spark SQL output while migrating a pipeline to Synapse, and when diffing two tables we found an issue with a double.
SQL Server presumably shares its conception of floats with C# (and with JavaScript, which the Python Notebook table preview in Synapse uses): -0.0 == 0.0. The JVM/Spark, however, in some cases compares by bit and differentiates the two because of the sign bit: -0.0 != 0.0.
It's resolved in later versions of Spark's DataFrames, and may not apply in the case of [decimal]String, so it may not be problematic.
- https://issues.apache.org/jira/browse/SPARK-26021
- https://www.mail-archive.com/[email protected]/msg283973.html
- https://issues.apache.org/jira/browse/SPARK-32110
- [SPARK-32110][SQL] normalize special floating numbers in HyperLogLog++ apache/spark#30673
- [BUG] -0.0 vs 0.0 is a hot mess NVIDIA/spark-rapids#294
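For anyone following along, here is a minimal Java sketch of the double behaviour described in this comment. It is illustrative only and not code from this PR or from Spark; the class and method names (`SignedZeroDemo`, `numericEquals`, `bitwiseEquals`) are made up for the example.

```java
// Illustrative sketch only; SignedZeroDemo is a hypothetical name, not part of this PR.
final class SignedZeroDemo {
    // Numeric comparison: what the == operator does on primitive doubles (IEEE 754).
    static boolean numericEquals(double a, double b) {
        return a == b;
    }

    // Bit-pattern comparison: what Double.equals and hash-based grouping effectively see.
    static boolean bitwiseEquals(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    public static void main(String[] args) {
        System.out.println(numericEquals(-0.0, 0.0));                          // true
        System.out.println(bitwiseEquals(-0.0, 0.0));                          // false
        System.out.println(Double.valueOf(-0.0).equals(Double.valueOf(0.0)));  // false
    }
}
```

The two zeros differ only in the sign bit, so any comparison that goes through the raw bits (or through boxed `Double.equals`) treats them as distinct values.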
This is because internally, BigDecimal uses BigInteger, and BigInteger also only has a single concept of zero. A BigInteger behaves as a two's-complement integer, and two's-complement only has a single zero.
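A small Java sketch of that point, assuming only the standard `java.math.BigDecimal` API; `DecimalZeroDemo` and its method names are hypothetical and exist just for this illustration:

```java
import java.math.BigDecimal;

// Illustrative sketch only; DecimalZeroDemo is a hypothetical name, not part of this PR.
final class DecimalZeroDemo {
    // "-0" parses to the same value as BigDecimal.ZERO: no sign survives on a zero.
    static boolean negativeZeroIsZero() {
        BigDecimal parsed = new BigDecimal("-0");
        return parsed.signum() == 0 && parsed.equals(BigDecimal.ZERO);
    }

    // Negating zero gives zero back, unlike -0.0 for doubles.
    static boolean negatedZeroIsZero() {
        return BigDecimal.ZERO.negate().equals(BigDecimal.ZERO);
    }

    // The string form of a parsed "-0.0" carries no minus sign.
    static String render(String input) {
        return new BigDecimal(input).toString();
    }

    public static void main(String[] args) {
        System.out.println(negativeZeroIsZero());  // true
        System.out.println(negatedZeroIsZero());   // true
        System.out.println(render("-0.0"));        // 0.0
    }
}
```

Note that `BigDecimal.equals` also compares scale, so `0.0` and `0` are not `equals` even though `compareTo` returns 0; that is a scale difference, not a signed-zero one.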
Any updates on this PR?
Hey @GoEddie, are you still working on this?
Hi @AFFogarty, I had given up really, as no one seemed to be reviewing PRs, but I am happy to get it up to date again. ed
Hi @GoEddie, thanks for the contribution! Are you still working on this? We recently got write permission to this repo and are happy to move this forward. We just added support for 3.3 and 3.5.
This implements #818
On the Apache Spark side, decimal is implemented using java.math.BigDecimal. Its integer constructors take an int or a long, so a value larger than the max long cannot be passed that way, and the string constructor is the simplest route for the full range. I therefore pass a string back and forth between .NET and the JVM; hope that is OK.
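As a rough illustration of the string round trip on the JVM side, here is a Java sketch. It is not the code in this PR: `DecimalWireDemo`, `fromWire`, `toWire`, and `parses` are hypothetical names, and the choice of `toPlainString` is an assumption made so the text never contains an exponent.

```java
import java.math.BigDecimal;

// Illustrative sketch only; DecimalWireDemo is a hypothetical name, not part of this PR.
final class DecimalWireDemo {
    // The digits of .NET's decimal.MaxValue, which does not fit in a long.
    static final String DOTNET_DECIMAL_MAX = "79228162514264337593543950335";

    // JVM side of the wire: build the BigDecimal from its string form.
    static BigDecimal fromWire(String text) {
        return new BigDecimal(text);
    }

    // Back to text without scientific notation (assumption: the reader expects plain digits).
    static String toWire(BigDecimal value) {
        return value.toPlainString();
    }

    // BigDecimal only accepts '.' as the decimal separator, so "1,5" is rejected.
    static boolean parses(String text) {
        try {
            new BigDecimal(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        BigDecimal max = fromWire(DOTNET_DECIMAL_MAX);
        System.out.println(max.compareTo(BigDecimal.valueOf(Long.MAX_VALUE)) > 0);  // true
        System.out.println(toWire(max).equals(DOTNET_DECIMAL_MAX));                 // true
        System.out.println(parses("1.5") + " " + parses("1,5"));                    // true false
    }
}
```

The last line is the reason the string should be culture-independent: a .NET `ToString()` under a culture that uses a comma as the decimal separator would produce text the JVM side cannot parse.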