Add 3.5.1-SNAPSHOT Shim #9962
Conversation
Signed-off-by: Raza Jafri <[email protected]>
I'm assuming your decimal multiply change is related to #9859... If so, please make sure it fixes it all the way, or we comment on that issue. The shim is very hard to read: one calls mul128, the other calls multiply128. I haven't gone and looked at those, but it's hard to even see that diff, so you should at the very least add a comment or point to the issue and explain. |
I will go ahead and put in some comments to highlight the change |
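For readers following the thread, here is a rough sketch of the per-version shim pattern being discussed. The method names mul128/multiply128 come from the comments above, but the package, signatures, and wrapper name are assumptions, not the actual PR code.

```scala
// Sketch only -- not the actual PR code. Signatures are assumed to mirror the
// existing multiply128 entry point in spark-rapids-jni.
import ai.rapids.cudf.{ColumnView, Table}
import com.nvidia.spark.rapids.jni.DecimalUtils

// sql-plugin/src/main/spark311/.../DecimalUtilShims.scala: pre-3.5.1 behavior,
// multiplying decimal-128 columns through the entry point that keeps the interim cast.
object DecimalUtilShims {
  def multiply(lhs: ColumnView, rhs: ColumnView, productScale: Int): Table =
    DecimalUtils.multiply128(lhs, rhs, productScale)
}

// The spark351 copy of this file would be identical except that the body calls
// DecimalUtils.mul128(lhs, rhs, productScale), the new jni method that skips the
// interim cast and matches Spark 3.5.1's multiplication semantics.
```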
Discussed this offline. I missed the division bit of the puzzle. Will verify division and post an update here |
I have verified the Decimal division and we match Spark 3.5.1 output. It turns out that we were always doing the right thing on the GPU for decimal division. So to match Spark bug for bug, we should "fix" Databricks 330+ and Spark 340+ by returning the bad answer. I have created an issue for that here |
build |
premerge failing due to an unrelated change |
build |
build |
build |
build |
build |
This reverts commit 533504f.
I have reverted the tests for versions that we don't support yet. They will be added in other shims |
build |
@andygrove can you PTAL? |
```scala
throw RapidsErrorUtils.
  arithmeticOverflowError("One or more rows overflow for Add operation.")
```
let us leave formatting-only changes to dedicated PRs
```scala
    withResource(actualSize) { _ =>
      val mergedEquals = withResource(start.equalTo(stop)) { equals =>
        if (step.hasNulls) {
          // Also set the row to null where step is null.
          equals.mergeAndSetValidity(BinaryOp.BITWISE_AND, equals, step)
        } else {
          equals.incRefCount()
        }
      }
      withResource(mergedEquals) { _ =>
        mergedEquals.ifElse(one, actualSize)
      }
    }
  }
```
```scala
    withResource(sizeAsLong) { _ =>
      // check max size
      withResource(Scalar.fromInt(MAX_ROUNDED_ARRAY_LENGTH)) { maxLen =>
        withResource(sizeAsLong.lessOrEqualTo(maxLen)) { allValid =>
          require(isAllValidTrue(allValid),
            s"Too long sequence found. Should be <= $MAX_ROUNDED_ARRAY_LENGTH")
        }
      }
      // cast to int and return
      sizeAsLong.castTo(DType.INT32)
    }
  }
```
The bottom portion, L85-L111 in 311 and L98-L126 in 351, differs only in the require message; let us refactor to minimize shimming.
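A minimal sketch of the refactor being suggested here, under the assumption that the shared check can live in common code while only the error text is shimmed; the helper and shim names below are made up for illustration.

```scala
// Illustration only -- checkSequenceSize and SequenceSizeErrorShim are hypothetical
// names, not code from this PR. The shared check lives in common code; each shim
// version supplies just the require message that differs between 311 and 351.
def checkSequenceSize(sizeAsLong: ColumnVector): ColumnVector = {
  withResource(Scalar.fromInt(MAX_ROUNDED_ARRAY_LENGTH)) { maxLen =>
    withResource(sizeAsLong.lessOrEqualTo(maxLen)) { allValid =>
      require(isAllValidTrue(allValid), SequenceSizeErrorShim.tooLongMessage)
    }
  }
  // cast to int and return, as in the existing code
  sizeAsLong.castTo(DType.INT32)
}
```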
This file should be dropped thanks to #9902
Thanks for taking a look @gerashegalov PTAL |
build |
LGTM
Do we plan to run nightly integration tests against spark-3.5.1-SNAPSHOT? |
Yes, we do |
This PR adds shims for Spark 3.5.1-SNAPSHOT.

Changes Made:

- The `Shimplify` command was run. The only files that were manually changed were `pom.xml` and `ShimServiceProvider.scala`, to add the SNAPSHOT version to the `VERSIONNAMES` (a rough sketch of this provider pattern appears at the end of this description). Some empty lines were also removed as a result of the above `Shimplify` command.
- `DecimalUtilShims.scala`, which calls the respective multiplication method depending on the Spark version. In Spark 3.5.1 and other versions, the multiplication doesn't perform an interim cast, and as part of a spark-rapids-jni PR another method called `mul128` was added which skips the interim cast.
- `ComputeSequenceSize.scala`, to provide a shim for the new method that calculates the sequence size and makes sure it's within the limit.
- `GpuBatchScanExec`, to match the changes in Spark.

Tests:
All integration tests were run locally.
fixes #9258
fixes #9859
fixes #9875
fixes #9743
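For context, here is a simplified illustration of the `ShimServiceProvider`/`VERSIONNAMES` pattern referenced in the change list above; it mirrors the plugin's existing per-shim provider pattern but is not the exact diff from this PR.

```scala
// Simplified illustration, not the exact diff in this PR. The key change for a
// SNAPSHOT Spark version is listing the -SNAPSHOT name in VERSIONNAMES so the
// plugin can match spark-3.5.1-SNAPSHOT builds.
import com.nvidia.spark.rapids.{ShimVersion, SparkShimVersion}

object SparkShimServiceProvider {
  val VERSION = SparkShimVersion(3, 5, 1)
  val VERSIONNAMES = Seq(s"$VERSION", s"$VERSION-SNAPSHOT")
}

class SparkShimServiceProvider extends com.nvidia.spark.rapids.SparkShimServiceProvider {
  override def getShimVersion: ShimVersion = SparkShimServiceProvider.VERSION
  override def matchesVersion(version: String): Boolean =
    SparkShimServiceProvider.VERSIONNAMES.contains(version)
}
```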