Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix JoinDataSource to not ignore DataSource layers silently #17726

Merged
merged 145 commits into from
Feb 21, 2025

Conversation

kgyrtkirk
Copy link
Member

  • JoinDataSource uses DataSourceAnalysis as a model - but during construction it ignores some other typed DataSource-s
  • there was a hack to possibly make those work - but it could have been easily happened that it just forgot to call the createSegmentMapFunction on those sources
  • at the same time DataSourceAnalysis also acts as a vertex-boundary like object; which makes this a bit more complicated.

This change is limited to fix the bug: I think in a followup will be needed to create a separate object which could be used inside the join - and a separate one to work with the boundary - the DataSource#getAnalysis was being used at a lot of places so the bugfix was separated for now.

- Rename createReport() to createModesReport() to reflect the updated functionality.
- Update getAnnotation() to getModesAnnotation() to align with the new annotation type.
- Add a new test method createQuidemReasonReport() to handle DecoupledTestConfig annotations.
- Update getQuidemReasonAnnotations() to handle DecoupledTestConfig annotations.
- Update createReport() to handle both NotYetSupported and DecoupledTestConfig annotations.

This commit introduces changes to the test methods and annotations to improve clarity and consistency in the test suite.
This reverts commit 173a057.
This reverts commit 293a483.
This reverts commit c236af7.
@@ -107,13 +109,23 @@
return getQuerySegmentSpecForLookUp(this).lookup(this, walker);
}

public DataSourceAnalysis getDataSourceAnalysis()

Check notice

Code scanning / CodeQL

Missing Override annotation

This method overrides [Query<T>.getDataSourceAnalysis](1); it is advisable to add an Override annotation.
@kgyrtkirk kgyrtkirk marked this pull request as ready for review February 19, 2025 10:12
throw new ISE("Unable to apply specific segments to query with dataSource[%s]", query.getDataSource());
if (!analysis.getEffectiveQuerySegmentSpec().equals(new MultipleSpecificSegmentSpec(descriptors))) {
// If you see the error message below, it's a bug in either this function or in DataSourceAnalysis.
throw new ISE("Unable to apply specific segments to query with dataSource[%s]", query.getDataSource());
Copy link
Contributor

@adarshsanjeev adarshsanjeev Feb 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could change this to a DruidException.defensive

@@ -161,21 +161,18 @@ public static <T> Query<T> withSpecificSegments(final Query<T> query, final List
retVal = query.withDataSource(new QueryDataSource(withSpecificSegments(subQuery, descriptors)));
} else {
retVal = query.withQuerySegmentSpec(new MultipleSpecificSegmentSpec(descriptors));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the else block being closed here earlier a bug?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Earlier it was able to traverse even QueryDataSource-es ; so it was available more widely.

The change here is that the getEffectiveQuery means The applicable {@link QuerySegmentSpec} for this vertex. ; so it only considers queries belonging to the current vertex.

The other interpretation also has a valuable meaning ; but I've not seen any need for it.

)
);
final Function<SegmentReference, SegmentReference> baseMapFn = joinAnalysis.getBaseDataSource().createSegmentMapFunction(query);
return baseSegment -> newHashJoinSegment(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe this should be createHashJoinSegment, so that it doesn't look like a constructor call?

@cryptoe cryptoe merged commit 592d2e0 into apache:master Feb 21, 2025
75 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants