Added updates on latest changes around nulls
spmallette committed Jan 29, 2025
1 parent 1288b9e commit 61d5b07
Showing 2 changed files with 138 additions and 5 deletions.
133 changes: 133 additions & 0 deletions book/Section-Beyond-Basic-Queries.adoc
@@ -5404,6 +5404,139 @@ the results of your queries as JSON. Remember that if you do save an entire grap
JSON, unless you specify otherwise, the default format is GraphSON 3.0 with
embedded types.


[[nulls]]
Using null in Gremlin
~~~~~~~~~~~~~~~~~~~~~

Gremlin does allow null values to move through a traversal. Whether or not nulls are
meaningful to you with Gremlin tends to depend on the graph database that you are
using and whether or not it supports storing null values. If the graph does not
support that capability, then you won't encounter a null in your traversal pipeline
or your results unless you introduce the null value yourself. You might do that by
way of a side-effect or some other Gremlin step that allows you to supply a value to
the stream. The following examples demonstrate a few ways that you might choose to
do this:

[source,groovy]
----
g.inject(null)
null
g.V().limit(10).coalesce(has('elev',gt(3000)), constant(null)).fold()
[null,v[2305],null,null,null,null,v[2300],null,v[2308],v[2307]]
----

The prior examples don't showcase any particular common use case, and unless the
graph itself supports storing null values there is little reason to inject them in
this fashion. For graphs that do support storing null, such as TinkerGraph, you can
treat nulls in much the same manner that you do other values in Gremlin.

NOTE: TinkerGraph is not configured to support null storage by default. You must
set 'gremlin.tinkergraph.allowNullPropertyValues' to true in its configuration to
enable it.
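
As a minimal sketch, assuming the graph is created from a properties file (for
example one loaded through 'GraphFactory.open' or referenced by a Gremlin Server
configuration), the setting might look like the fragment below. The file name in
the comment is hypothetical.

```properties
# conf/tinkergraph-nulls.properties (hypothetical file name)
gremlin.graph=org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph
gremlin.tinkergraph.allowNullPropertyValues=true
```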

Before we look too closely at how null is used in a graph that supports it, let's
first take a look at what happens with null for graphs that do not. In particular,
we should look at the 'property' step.

[source,groovy]
----
g.addV('airport').property('code',null)
v[41223]
g.V().has('code',null)
// no result
g.V().hasNot('code')
v[41223]
----

NOTE: For better portability, prefer 'drop' when removing properties.

In the prior example, Gremlin semantics specify that a call to 'property' with a
null value is ignored if the property does not exist and removes the property if it
does. The following demonstrates the latter:

[source,groovy]
----
g.addV('airport').property('code','XYZ')
v[41224]
g.V(41224).property('code',null)
v[41224]
g.V().hasNot('code')
v[41224]
----

Now that we've looked at graphs that don't support null, let's look at how Gremlin
behaves for those that do:

[source,groovy]
----
g.addV('airport').property('code',null)
v[41228]
g.V().has('code',null)
v[41228]
g.V().has('code',within('IAD',null)).values('code')
null
IAD
----

As you can see in the prior example, the null value is being stored in and retrieved
from the graph. By electing to use this feature, you now have the additional burden
of accounting for null in your queries. Gremlin steps tend to behave in null-safe
ways, as shown in the following examples:

[source,groovy]
----
g.V().has('code',within('IAD',null)).values('code').substring(0,1).fold()
[null,I]
g.V().has('code',within('IAD',null)).values('code').length().fold()
[null,3]
g.V().has('code',within('IAD',null)).values('code').groupCount()
[null:1,IAD:1]
g.V().has('code',within('IAD',null)).values('code').order()
null
IAD
g.V().has('code',within('IAD',null)).values('code').order().by(desc)
IAD
null
----

The choice to use null is often made for you: many graphs do not support storing
nulls, and on those graphs injecting nulls into your queries yourself rarely adds
much value. When the graph you are using does support null, think carefully before
relying on it, because doing so reduces the portability of your application: any
graph you might move to later must also support the feature. On the flip side, if
you choose a graph that does not support null values and use the
'property('key',null)' syntax to remove properties, keep in mind that this syntax
will behave quite differently if you switch to a graph that does support storing
nulls!
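
To make the portability concern concrete, here is a small plain-Java model of the
two behaviors described in this section. It is only a sketch: 'NullPropertyModel'
and its methods are hypothetical names invented for illustration, a 'HashMap' stands
in for a single vertex's properties, and none of it is TinkerPop code.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of one vertex's properties. Hypothetical class, not TinkerPop code.
class NullPropertyModel {
    private final Map<String, Object> properties = new HashMap<>();
    private final boolean allowNullValues;

    NullPropertyModel(boolean allowNullValues) {
        this.allowNullValues = allowNullValues;
    }

    // Models property(key, value) as described in the text.
    void property(String key, Object value) {
        if (value == null && !allowNullValues) {
            properties.remove(key); // ignored if absent, removed if present
        } else {
            properties.put(key, value); // may store an explicit null
        }
    }

    // Models has(key, null): the key exists and its stored value is null.
    boolean hasNull(String key) {
        return properties.containsKey(key) && properties.get(key) == null;
    }

    // Models hasNot(key): the key is absent (an assumption of this model).
    boolean hasNot(String key) {
        return !properties.containsKey(key);
    }

    public static void main(String[] args) {
        NullPropertyModel noNulls = new NullPropertyModel(false);
        noNulls.property("code", "XYZ");
        noNulls.property("code", null); // acts as a removal
        System.out.println(noNulls.hasNot("code"));  // true
        System.out.println(noNulls.hasNull("code")); // false

        NullPropertyModel withNulls = new NullPropertyModel(true);
        withNulls.property("code", "XYZ");
        withNulls.property("code", null); // stores an explicit null
        System.out.println(withNulls.hasNull("code")); // true
    }
}
```

The same 'property("code", null)' call removes the value in the first model and
stores a null in the second, which is why code that relies on one behavior may
silently change meaning on a different graph.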

[[traversal-strategies]]
Understanding TraversalStrategies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
10 changes: 5 additions & 5 deletions book/Section-Moving-Beyond.adoc
@@ -643,8 +643,8 @@ Checking to see if a query returned a result
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

It is often important to know if a query returned a result before trying to reference
-it to avoid those pesky Java Null Pointer Exceptions. Without worrying about Java for
-a second consider the query below purely from a Gremlin point of view.
+it to avoid those pesky Java 'NoSuchElementException' errors. Without worrying about
+Java for a second consider the query below purely from a Gremlin point of view.

[source,groovy]
----
@@ -669,8 +669,8 @@ Long result =
----

On the surface, this looks fine. However, were we to execute this code we would get a
-Null Pointer Exception as when we try to call 'next' there is no result to process as
-there is no edge between Austin and Sydney and hence no distance value to process.
+'NoSuchElementException' as when we try to call 'next' there is no result to process
+as there is no edge between Austin and Sydney and hence no distance value to process.
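
The failure mode can be sketched without a graph at all. In the Java API a
'Traversal' is a 'java.util.Iterator', so an empty iterator is used below as a
stand-in for a traversal that matched nothing. The class and method names are
hypothetical, and the fallback value is an arbitrary choice for illustration.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Plain-Java sketch: an empty Iterator stands in for a traversal with no
// results. No TinkerPop classes are used here.
class EmptyResultDemo {

    // Unsafe: calls next() without checking whether a result exists.
    static Long unguarded(Iterator<Long> results) {
        return results.next(); // throws NoSuchElementException when empty
    }

    // Safe: checks hasNext() first and falls back to a caller-supplied value.
    static Long guarded(Iterator<Long> results, Long fallback) {
        return results.hasNext() ? results.next() : fallback;
    }

    public static void main(String[] args) {
        Iterator<Long> noRoute = Collections.<Long>emptyIterator();
        try {
            unguarded(noRoute);
        } catch (NoSuchElementException e) {
            System.out.println("No result: " + e.getClass().getSimpleName());
        }
        System.out.println(guarded(noRoute, -1L));
    }
}
```

Checking 'hasNext' before calling 'next' is one way to avoid the exception when a
query may legitimately return nothing.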

NOTE: The source code in this section comes from the 'GraphSearch2.java' sample
located at https://github.com/krlawrence/graph/tree/main/sample-code/java.
@@ -717,7 +717,7 @@ Integer d = (Integer)
If the route exists the distance will be found and returned, otherwise a value of
'"-1"' will be returned. This is really using the same concept as the 'toList'
example except in this case we generate the list using the 'fold' step within the
-query itself. The 'unfold' will return a result if the list is not null, otherwise
+query itself. The 'unfold' will return a result if the list is not empty, otherwise
the constant value will be returned as 'coalesce' returns the first to yield a
result.
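
The "first option to yield a result wins" idea can be modeled in plain Java. This
is a sketch only: 'CoalesceSketch', its 'coalesce' helper and 'distanceOrDefault'
are hypothetical names that mimic the behavior described above, they are not part
of the Gremlin API, and the distance used in 'main' is a made-up value.

```java
import java.util.List;
import java.util.function.Supplier;

// Plain-Java model of the fold/coalesce/unfold/constant pattern.
// Hypothetical helper class, not TinkerPop code.
class CoalesceSketch {

    // Returns the results of the first option that yields at least one item.
    @SafeVarargs
    static <T> List<T> coalesce(Supplier<List<T>>... options) {
        for (Supplier<List<T>> option : options) {
            List<T> results = option.get();
            if (!results.isEmpty()) {
                return results;
            }
        }
        return List.of();
    }

    // 'folded' plays the role of the list built by fold(). Unfolding an empty
    // list yields nothing, so the constant -1 is returned instead.
    static Integer distanceOrDefault(List<Integer> folded) {
        return CoalesceSketch.<Integer>coalesce(() -> folded, () -> List.of(-1)).get(0);
    }

    public static void main(String[] args) {
        System.out.println(distanceOrDefault(List.of(1500))); // route found
        System.out.println(distanceOrDefault(List.of()));     // no route: -1
    }
}
```

Because the folded list always exists, even when it is empty, the query always
produces exactly one value and the caller never has to guard the call to 'next'.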
