Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix expression post aggregator array handling when grouping wrapper types leak #15543

Conversation

clintropolis
Copy link
Member

Description

Fixes an issue which can occur when using ExpressionPostAggregator on on array types output from grouping engine in certain situations such as inline data. The grouping engine uses wrapper types for arrays, ComparableStringArray and ComparableList so that they implement Comparable (current requirement for all grouping dimension types). The native expression system did not recognize these types, leading to them being treated as COMPLEX rather than ARRAY types, which made stuff sad.

This also made an adjustment to FunctionalExpr which after #14987 was causing things like cast exceptions to eventually end up going through DruidException.defensive and presented as unknown server-side errors. While in this case the class cast exception really was a bug, in most cases it would not be, and so these are now handled as user errors so that the exception message is now shown to the user.


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@@ -207,7 +207,7 @@
break;
}
if (!ExpressionProcessing.useStrictBooleans() && !type.is(ExprType.STRING) && !type.isArray()) {
return ExprEval.ofBoolean(result, type.getType());
return ExprEval.ofBoolean(result, type);

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note

Invoking
ExprEval.ofBoolean
should be avoided because it has been deprecated.
@@ -184,7 +184,7 @@
if (!ExpressionProcessing.useStrictBooleans()) {
// conforming to other boolean-returning binary operators
ExpressionType retType = ret.type().is(ExprType.DOUBLE) ? ExpressionType.DOUBLE : ExpressionType.LONG;
return ExprEval.ofBoolean(!ret.asBoolean(), retType.getType());
return ExprEval.ofBoolean(!ret.asBoolean(), retType);

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note

Invoking
ExprEval.ofBoolean
should be avoided because it has been deprecated.
Comment on lines +1292 to +1307
() -> testHelper.testExpressionString(
roundFunction,
ImmutableList.of(
DruidExpression.ofColumn(ColumnType.FLOAT, "x"),
DruidExpression.ofStringLiteral("foo")
)
),
"IAE Exception"
testHelper.makeInputRef("x"),
testHelper.makeLiteral("foo")
),
DruidExpression.ofExpression(
ColumnType.FLOAT,
DruidExpression.functionCall("round"),
ImmutableList.of(
DruidExpression.ofColumn(ColumnType.FLOAT, "x"),
DruidExpression.ofStringLiteral("foo")
)
),
"IAE Exception"
)

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note test

Invoking
ExpressionTestHelper.testExpressionString
should be avoided because it has been deprecated.
Comment on lines +2325 to +2333
() -> testHelper.testExpressionString(
new RightOperatorConversion().calciteOperator(),
ImmutableList.of(
testHelper.makeInputRef("s"),
testHelper.makeLiteral(-1)
),
makeExpression("right(\"s\",-1)"),
null
)

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note test

Invoking
ExpressionTestHelper.testExpressionString
should be avoided because it has been deprecated.
testHelper.makeInputRef("s"),
testHelper.makeLiteral(-1)
),
makeExpression("right(\"s\",-1)"),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note test

Invoking
CalciteTestBase.makeExpression
should be avoided because it has been deprecated.
testHelper.makeInputRef("s"),
testHelper.makeLiteral(-1)
),
makeExpression("left(\"s\",-1)"),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note test

Invoking
CalciteTestBase.makeExpression
should be avoided because it has been deprecated.
Comment on lines +2436 to +2444
() -> testHelper.testExpressionString(
new LeftOperatorConversion().calciteOperator(),
ImmutableList.of(
testHelper.makeInputRef("s"),
testHelper.makeInputRef("s")
),
makeExpression("left(\"s\",\"s\")"),
null
)

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note test

Invoking
ExpressionTestHelper.testExpressionString
should be avoided because it has been deprecated.
testHelper.makeInputRef("s"),
testHelper.makeInputRef("s")
),
makeExpression("left(\"s\",\"s\")"),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note test

Invoking
CalciteTestBase.makeExpression
should be avoided because it has been deprecated.
Comment on lines +2491 to +2499
() -> testHelper.testExpressionString(
new RepeatOperatorConversion().calciteOperator(),
ImmutableList.of(
testHelper.makeInputRef("s"),
testHelper.makeInputRef("s")
),
makeExpression("repeat(\"s\",\"s\")"),
null
)

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note test

Invoking
ExpressionTestHelper.testExpressionString
should be avoided because it has been deprecated.
testHelper.makeInputRef("s"),
testHelper.makeInputRef("s")
),
makeExpression("repeat(\"s\",\"s\")"),

Check notice

Code scanning / CodeQL

Deprecated method or constructor invocation Note test

Invoking
CalciteTestBase.makeExpression
should be avoided because it has been deprecated.
@vogievetsky
Copy link
Contributor

I did a build of this and I can confirm that this fixes the error that I brought up.

Comment on lines 196 to 203
throw DruidException.forPersona(DruidException.Persona.USER)
.ofCategory(DruidException.Category.INVALID_INPUT)
.build(e.getMessage());
}
catch (Types.InvalidCastException | Types.InvalidCastBooleanException e) {
throw DruidException.forPersona(DruidException.Persona.USER)
.ofCategory(DruidException.Category.INVALID_INPUT)
.build("Function[%s] encountered exception: %s", name, e.getMessage());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will throw away the stacktrace - the old version was retaining that

The handler part here is for when really unexpected stuff happens...
I think:

  • ExpressionValidationException should not be handled
  • InvalidCastException should be a DruidException and done...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will throw away the stacktrace - the old version was retaining that
I don't think the stack trace is particularly useful for either of these since the exception message clearly says the problem and they are not caused by bugs, but rather bad user input. defensive is expected to be a bug, so the stack trace is a lot more important.

ExpressionValidationException should not be handled
I handled this here because it should be a user exception and have a 400 response, it results from the user writing the function incorrectly. If not handled here, it needs to be handled somewhere else to make it a user exception.

InvalidCastException should be a DruidException and done...
The problem with making it a DruidException when it is thrown is that where it is thrown from we don't have the context of function name, so catching it to decorate with function name isn't particularly straightforward at the moment afaik. This is also a user exception and should have a 400 response.

I'm still not completely convinced that expressions should be throwing DruidException... While prependAndBuild would partially solve these needs, it also reduces the exception cause to basically a string message, and all we can really do with it is decorate it with more strings. I'm unsure if this is too limiting, still thinking a bit about it, but didn't really want to change too much unrelated stuff in this PR. Currently afaik this is the only place in native expressions throw DruidException, and it only is handled for non-vectorized expressions. Vectorized expression processing for functions just make a ExprVectorProcessor like all other expressions, so we can't actually handle this generically in a centralized place, and ApplyFunctionExpr doesn't have this same handling that FunctionExpr has, so I find things are in a bit of a strange inconsistent state.

Ideally, we should catch all exception from outside of expressions from where they are called, like selectors and vector selectors of ExpressionVirtualColumn, ExpressionFilter, ExpressionLambdaAggregatorFactory aggs, and ExpressionPostAggregator, so that they could add the additional context of like what the expression belongs to. The question I have is if DruidException is enough, or if we should have native expression exceptions that the things that catch them know how to appropriately translate them into DruidException, but I don't really want to address that in this PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay - I don't have any big expectation from these things ; just 1: if there was a parent exception it should be set for the new exception ; please restore that behavior somehow - if the exception will get logged for some reason that part must also be logged; this is also usefull when a test fails for whatever reason: having the full stacktrace will make it possible to navigate to the place where the issue happened.

I think DruidException is designed oddly - but its still possible to pass the parent exception to the build method (I think the other method should be dropped - so that when it gets invoked the developer must fill it to null to get that behaviour - which is less errorprone)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated to retain cause exception for the user exceptions

@@ -141,4 +141,20 @@ public IncompatibleTypeException(TypeSignature<?> type, TypeSignature<?> other)
super("Cannot implicitly cast [%s] to [%s]", type, other);
}
}

public static class InvalidCastException extends IAE
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use DruidException ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other comment

.ofCategory(DruidException.Category.INVALID_INPUT)
.build(e.getMessage());
}
catch (Types.InvalidCastException | Types.InvalidCastBooleanException e) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this should be so specific? I think IAE should be allowed with a plain rethrow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other comment about translating into invalid input/user exception so it makes the appropriate type of error response and not a system error

@@ -190,11 +191,22 @@ public ExprEval eval(ObjectBinding bindings)
try {
return function.apply(args, bindings);
}
catch (DruidException | ExpressionValidationException e) {
catch (ExpressionValidationException e) {
// ExpressionValidationException already contain function name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if that's true; why catch it at all?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment about making invalid input/user error short of catching it somewhere else

@@ -357,17 +360,17 @@ public static ExprEval ofArray(ExpressionType outputType, @Nullable Object[] val
* instead.
*/
@Deprecated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this method really @Deprecated ? doesn't looks like it..

Copy link
Member Author

@clintropolis clintropolis Dec 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it is only used in 2 places, and ideally nothing new should use it, it is for the legacy boolean handling mode (strictBooleans = false)

Copy link
Member

@kgyrtkirk kgyrtkirk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lots of assertThrows changes :D
I wonder if there are any tools which could help to make it easier to do changes like this

@kgyrtkirk
Copy link
Member

actually I was looking around yesterday for some tool which could help with a tricky refactoring step - I bumped into one which looked to much at first glance and I needed a few minutes to remember the name of the project...but it has a recipe to do something similar :D

https://docs.openrewrite.org/recipes/java/testing/junit5/removetrycatchfailblocks

it also have junit4to5 stuff... I'll probably try it out someday :)

Copy link
Contributor

@soumyava soumyava left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@clintropolis clintropolis merged commit e373f62 into apache:master Dec 16, 2023
87 checks passed
@clintropolis clintropolis deleted the fix-leaky-group-by-array-types-postaggs branch December 16, 2023 05:43
@LakshSingla LakshSingla added this to the 29.0.0 milestone Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants