feat: add support for query that have aggregation but missing group by #124

Merged
kotharironak merged 9 commits into main from handle-groupby
Aug 30, 2022

Conversation

Contributor

@kotharironak kotharironak commented Aug 26, 2022

Description

As part of issue #123, this PR:

  • adds support for queries that have an aggregated selection but no group by

Solution:

  • add a query transformer that removes all non-aggregated selections

Mongo:

  • Mongo inherently does this and discards the non-aggregated selections

Postgres:

  • The query fails as written, so we need to pre-process it
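
A minimal sketch of the pre-processing idea described above, for illustration only. The `Selection` class, the `transform` and `toSelectClause` methods, and the `item` selection are hypothetical stand-ins, not the actual `PostgresSelectionQueryTransformer` code from this PR.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical, simplified model; these are not the project's real classes.
final class SelectionTransformerSketch {

  // One SELECT entry: a SQL expression, its alias, and whether it is an aggregation.
  static final class Selection {
    final String expression;
    final String alias;
    final boolean aggregated;

    Selection(String expression, String alias, boolean aggregated) {
      this.expression = expression;
      this.alias = alias;
      this.aggregated = aggregated;
    }
  }

  // Keeps only aggregated selections when the query aggregates without a GROUP BY;
  // every other query is returned unchanged.
  static List<Selection> transform(List<Selection> selections, boolean hasGroupBy) {
    boolean hasAggregation = selections.stream().anyMatch(s -> s.aggregated);
    if (hasGroupBy || !hasAggregation) {
      return selections;
    }
    return selections.stream().filter(s -> s.aggregated).collect(Collectors.toList());
  }

  // Renders the SELECT clause for the (possibly transformed) selections.
  static String toSelectClause(List<Selection> selections) {
    return selections.stream()
        .map(s -> s.expression + " AS " + s.alias)
        .collect(Collectors.joining(", ", "SELECT ", ""));
  }
}
```

Given an `item` selection plus a `COUNT(DISTINCT document->>'quantity')` selection aliased `QTY`, and no group by, this sketch renders `SELECT COUNT(DISTINCT document->>'quantity') AS QTY`. With a group by present, both selections are kept.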

Testing

  • added both unit and integration tests

Checklist:

  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • Any dependent changes have been merged and published in downstream modules


codecov bot commented Aug 26, 2022

Codecov Report

Merging #124 (91710ec) into main (a42561f) will increase coverage by 0.26%.
The diff coverage is 92.85%.

@@             Coverage Diff              @@
##               main     #124      +/-   ##
============================================
+ Coverage     77.86%   78.13%   +0.26%     
- Complexity      512      524      +12     
============================================
  Files            87       89       +2     
  Lines          2679     2707      +28     
  Branches        266      268       +2     
============================================
+ Hits           2086     2115      +29     
+ Misses          431      430       -1     
  Partials        162      162              
Flag Coverage Δ
integration 78.13% <92.85%> (+0.26%) ⬆️
unit 47.91% <75.00%> (+0.28%) ⬆️


Impacted Files Coverage Δ
...query/v1/transformer/PostgresQueryTransformer.java 88.88% <88.88%> (ø)
...transformer/PostgresSelectionQueryTransformer.java 92.30% <92.30%> (ø)
...ore/documentstore/postgres/PostgresCollection.java 73.28% <100.00%> (+0.71%) ⬆️
...ntstore/postgres/query/v1/PostgresQueryParser.java 100.00% <100.00%> (ø)


* this transformer removes all the non-aggregated expressions. So, the above query will be transformed
* to:
*
* SELECT COUNT(DISTINCT document->>'quantity' ) AS QTY
Contributor

@suresh-prakash suresh-prakash Aug 26, 2022

Not sure if we should go this route. We have never transformed a user-given query into a different, non-equivalent query just because some database does not support it. In fact, this transformation alters the input selections and removes some of them. This might cause unexpected or undesired effects in clients, because they ask for 2 selections but we return only 1, silently ignoring the other.

The query transformer in Mongo only builds equivalent queries, either by rewriting the given expressions into equivalent forms or by adding expressions to support the rewrite. The overall query is still transformed into an equivalent query that the database supports. We neither remove anything nor transform to a non-equivalent query. Ideally, for such scenarios, we should fail (even in Mongo, if that's not the case today). I suspect that, in the case of Mongo, the DB itself is not returning the rows.

Contributor

For example, in Mongo we convert "DISTINCT_COUNT" into "$addToSet" in the "$group" stage and add "$size" in the "$project" stage. But, the query is still equivalent.
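
As a rough illustration of that conversion, the sketch below builds the two stages as plain maps. The class and method names are hypothetical and this is not the library's Mongo query transformer; it only mirrors the `$addToSet` and `$size` shape described in the comment above.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; it only mirrors the pipeline shape described in this thread.
final class DistinctCountPipelineSketch {

  // Builds the $group and $project stages for a distinct count of `field`, exposed as `alias`.
  static List<Map<String, Object>> distinctCountStages(String field, String alias) {
    // $group: collect the distinct values of the field into a set, using one global group.
    // LinkedHashMap is used because Map.of does not accept the null "_id" value.
    Map<String, Object> groupBody = new LinkedHashMap<>();
    groupBody.put("_id", null);
    groupBody.put(alias, Map.of("$addToSet", "$" + field));

    // $project: replace the set with its size, which is the distinct count.
    Map<String, Object> projectBody = new LinkedHashMap<>();
    projectBody.put(alias, Map.of("$size", "$" + alias));

    Map<String, Object> groupStage = Map.of("$group", groupBody);
    Map<String, Object> projectStage = Map.of("$project", projectBody);
    return List.of(groupStage, projectStage);
  }
}
```

Calling `distinctCountStages("quantity", "qty_count")` yields a `$group` stage with `"_id": null` and `"qty_count": {"$addToSet": "$quantity"}`, followed by a `$project` stage with `"qty_count": {"$size": "$qty_count"}`.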

Contributor Author

Mongo internally does that and discards them. I did consider the above path, but we have to do this for backward compatibility, because the Query API layer currently does not discard those queries.

e.g., the Mongo query for the above sample query:

[{"$match": {"$and": [{"price": {"$lte": 10}}, {"item": {"$in": ["Mirror", "Comb", "Shampoo", "Bottle"]}}]}}, 
{"$group": {"qty_count": {"$addToSet": "$quantity"}, "_id": null}}, 
{"$project": {"item": 1, "qty_count": {"$size": "$qty_count"}, "price": 1}}]

Response to the above query from Mongo (the non-aggregated selections are discarded):

[{"qty_count":3}]

@kotharironak
Contributor Author

@suresh-prakash As discussed,

  • we will go with this stop-gap solution
  • clean up all the clients
  • revert both the Mongo and Postgres implementations to throw an exception.

I am still testing this with a custom build; if any issue comes up, I will fix it first.
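
A rough sketch of what the eventual strict behaviour could look like once the clients are cleaned up. The `Selection` class, the `validate` method, and the choice of `IllegalArgumentException` are all assumptions for illustration; the thread does not say which exception type the implementations would throw.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical validator; names and exception type are illustrative assumptions.
final class StrictSelectionValidatorSketch {

  // One selection, reduced to its alias and whether it is an aggregation.
  static final class Selection {
    final String alias;
    final boolean aggregated;

    Selection(String alias, boolean aggregated) {
      this.alias = alias;
      this.aggregated = aggregated;
    }
  }

  // Rejects queries that mix aggregated and non-aggregated selections without a GROUP BY,
  // instead of silently dropping the non-aggregated ones.
  static void validate(List<Selection> selections, boolean hasGroupBy) {
    boolean hasAggregation = selections.stream().anyMatch(s -> s.aggregated);
    if (hasGroupBy || !hasAggregation) {
      return;
    }
    List<String> offending =
        selections.stream()
            .filter(s -> !s.aggregated)
            .map(s -> s.alias)
            .collect(Collectors.toList());
    if (!offending.isEmpty()) {
      throw new IllegalArgumentException(
          "Non-aggregated selections require a group by: " + offending);
    }
  }
}
```

Failing fast like this makes the mismatch visible to the caller, which is the concern raised in the review: a client asking for two selections would get an error rather than a response with one of them missing.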

@kotharironak kotharironak merged commit 8b831d6 into main Aug 30, 2022
@kotharironak kotharironak deleted the handle-groupby branch August 30, 2022 09:04
@github-actions

Unit Test Results

  16 files  ±0    16 suites  ±0   7s ⏱️ +2s
101 tests +1  101 ✔️ +1  0 💤 ±0  0 ❌ ±0 
251 runs  +3  251 ✔️ +3  0 💤 ±0  0 ❌ ±0 

Results for commit 8b831d6. ± Comparison against base commit a42561f.
