-
Notifications
You must be signed in to change notification settings - Fork 151
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make OPTIONAL MATCH and WHERE interaction easier to understand #191
Comments
+1 on this CIR. Although Idea 1 has the advantage of being far less drastic in terms of existing usage, I must say the synergy of Idea 2 and subqueries is very compelling indeed... |
+1 for Idea 2, I like the clarity this provides. |
I would add Idea 0: "Take no action" - as the semantic of WHERE tied to OPTIONAL MATCH is |
I stumbled upon this problem recently and just found this issue. I ran a couple of experiments to better understand the semantics of I used the following toy graph:
Matching on a single edge is fairly simple:
Things get more complicated with two edges, especially for the query plan, which contains an Apply and an Optional operator.
Splitting the
Note that the TCK currently does not fully cover this issue. The only that have two edges in an $ ag 'OPTIONAL MATCH.*\(.*\(.*\(' -C 2
MatchAcceptance2.feature
316- """
317- MATCH (a {name: 'A'})
318: OPTIONAL MATCH (a)-[:KNOWS]->()-[:KNOWS]->(foo)
319- RETURN foo
320- """
--
354- """
355- MATCH (a {name: 'A'})
356: OPTIONAL MATCH p = (a)-->(b)-[*]->(c)
357- RETURN p
358- """
OptionalMatchAcceptance.feature
258- """
259- MATCH (a:Single), (c:C)
260: OPTIONAL MATCH (a)-->(b)-->(c)
261- RETURN b
262- """
--
270- """
271- MATCH (a:A), (c:C)
272: OPTIONAL MATCH (a)-->(b)-->(c)
273- RETURN b
274- """ We are happy to write some additional TCK tests if you agree that they'd be useful. The queries above might provide a good starting point for that. Update: I checked the SIGMOD 2018 Cypher paper on the matter: This, as expected, captures the semantics of the language correctly, but is rather difficult (but not impossible) to translate to standard relational algebra. |
@szarnyasg Thanks for raising this, and please do write the additional TCK tests for this (much appreciated). |
@szarnyasg Interesting analysis, and yes we would of course be interested in such a TCK contribution (as Petra already mentions). You write that the TCK 'does not fully cover this issue' -- I wonder to which issue you are referring? The general issue to which this CIR is dedicated (OPT MATCH + WHERE giving allegedly surprising behaviour), or another issue in particular? Are you perhaps looking for being able to express predicates covering multiple |
@Mats-SX I was referring to the $ ag 'OPTIONAL MATCH.*\n.*OPTIONAL MATCH' -C 2
MatchAcceptance2.feature
1646- """
1647- MATCH (a:A)
1648: OPTIONAL MATCH (a)-[:FOO]->(b:B)
1649: OPTIONAL MATCH (b)<-[:BAR*]-(c:B)
1650- RETURN a, b, c
1651- """
OptionalMatchAcceptance.feature
97- """
98- MATCH (a:Single)
99: OPTIONAL MATCH (a)-->(b:NonExistent)
100: OPTIONAL MATCH (a)-->(c:NonExistent)
101- WITH coalesce(b, c) AS x
102- MATCH (x)-->(d)
--
282- """
283- MATCH (a:A), (b:B)
284: OPTIONAL MATCH (a)-->(x)
285: OPTIONAL MATCH (x)-[r]->(b)
286- RETURN x, r
287- """
--
309- When executing query:
310- """
311: OPTIONAL MATCH (a:NotThere)
312: OPTIONAL MATCH (b:NotThere)
313- WITH a, b
314- OPTIONAL MATCH (b)-[r:NOR_THIS]->(a)
--
329- When executing query:
330- """
331: OPTIONAL MATCH (f:DoesExist)
332: OPTIONAL MATCH (n:DoesNotExist)
333- RETURN collect(DISTINCT n.property) AS a, collect(DISTINCT f.property) AS b
334- """ |
So the "issue" is the low level of TCK coverage for the feature? Then I understand. |
I think our "issue" was that |
I have checked queries you had posted - the results are expectedly accordant with Cypher's semantic, eval_clause(Match,Graph,BindingTable,MatchBindingTable,Graph) :-
Match = match(MatchModifier,Pattern,Where),
findall(MatchBindingRow,
(
member(BindingRow,BindingTable),
eval_match_(MatchModifier,Graph,Pattern,Where,BindingRow,MatchBindingRow)
),
MatchBindingTable).
eval_match_(no_modifier,Graph,Pattern,Where,BindingRow,MatchBindingRow)
:-
eval_match(BindingRow,Graph,Pattern,MatchBindingRow),
eval_where(Graph,Where,MatchBindingRow).
eval_match_(optional,Graph,Pattern,Where,BindingRow,MatchBindingRow)
:-
findall(MatchBindingRow_,
(
eval_match(BindingRow,Graph,Pattern,MatchBindingRow_),
eval_where(Graph,Where,MatchBindingRow_)
),
MatchBindingRows_),
MatchBindingRows_ = [_|_],
!,
member(MatchBindingRow,MatchBindingRows_).
eval_match_(optional,_Graph,Pattern,_Where,BindingRow,MatchBindingRow)
:-
get_names(Pattern,Names),
list_to_set(Names,NamesSet),
maplist([Name,bind(Name,cypher_null)]>>true,NamesSet,Binds),
extend_binding_row(BindingRow,Binds,MatchBindingRow). If you wish you can verify your respective query results in Cypher.PL
Cypher.PL console is available at: |
Thanks, @JanekPo for investigating this with your implementation. Neo4j returns the correct results indeed, but it is worth trying to make the interaction of |
Add tests for interactions of OPTIONAL MATCH and WHERE #191
CIR-2017-191
The fact that
WHERE
does not filter rows when part of anOPTIONAL MATCH
has confused many users. Semantically,OPTIONAL MATCH
behaves like an outer join, and will only ever be additive (>= 0) to the cardinality of the preceding query. Consider the below example query, and note the comment:In contrast, moving the
WHERE
subclause to be attached to theMATCH
clause, the query will return an empty result:It has not been clear to users that
WHERE
is intimately attached to its preceding clause, and that it only affects the cardinality of that clause, rather than the whole query (although this difference does not exist in the cases ofWITH
andMATCH
). This CIR requests some way of making this more clear to the user.Idea 1
Prohibit
WHERE
subclauses that do not depend on variables bound by its superclause. This would make the example queries in the description of this CIR invalid.This idea could be 'soft', and allow predicate expressions that partially depend on the bound variables:
Or it could be 'strict', and only allow predicate expressions without constant parts.
Idea 2
Replace the
OPTIONAL MATCH
clause with a subquery clause;OPTIONAL
. The first query from the description would be expressed like so:Ideas 1 and 2 are not mutually exclusive.
The text was updated successfully, but these errors were encountered: