Need a query to retrieve complete graph

Question

I am trying to retrieve all the nodes and their properties in a parent-child hierarchy, with each child nested within its parent. Since I am new to Gremlin and graph databases, I am having a really tough time getting it done.

Please suggest a solution; if you could walk me through it, that would be great.

Following is my structure

I am trying to keep the response as clean as possible. I am using Cosmos DB and the Gremlin.NET API for this.

I tried the following, but it gave me the response as key-value pairs:

g.V("some_id").repeat(out()).emit().tree().path()
g.V("some_id").emit().repeat(both().simplePath()).dedup()

Any kind of suggestion would be great.

stephen mallette · Accepted Answer · 2022-09-30T12:33:29.450

I'm not sure what format you want for your result, but path(), tree() or subgraph() would typically give you the graph structure. Since you are using CosmosDB, your only options are path() and tree(), as subgraph() does not appear to be supported.

Using this sample graph as a simple tree:

g.addV().property(id, '1').as('1').
  addV().property(id, '2a').as('2a').
  addV().property(id, '2b').as('2b').
  addV().property(id, '3a').as('3a').
  addV().property(id, '4a').as('4a').
  addE('child').from('1').to('2a').
  addE('child').from('1').to('2b').
  addE('child').from('2a').to('3a').
  addE('child').from('3a').to('4a')

you can see the effect of path() which basically gathers the contents of each step Gremlin took:

gremlin> g.V('1').repeat(out()).emit().path()
==>[v[1],v[2a]]
==>[v[1],v[2b]]
==>[v[1],v[2a],v[3a]]
==>[v[1],v[2a],v[3a],v[4a]]

Since I used out() we don't see the edges, but that is easily remedied by making a small adjustment to directly consume edges into the path history:

gremlin> g.V('1').repeat(outE().inV()).emit().path()
==>[v[1],e[0][1-child->2a],v[2a]]
==>[v[1],e[1][1-child->2b],v[2b]]
==>[v[1],e[0][1-child->2a],v[2a],e[2][2a-child->3a],v[3a]]
==>[v[1],e[0][1-child->2a],v[2a],e[2][2a-child->3a],v[3a],e[3][3a-child->4a],v[4a]]

Taken together with duplication removed on your application side you have a complete graph with path().
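
The application-side deduplication mentioned above could look roughly like the following. This is only a minimal Python sketch, assuming each path has already been deserialized into a list that alternates vertex id, edge id, vertex id, as in the outE().inV() output. The `merge_paths` helper and the string ids (`"e0"` and so on) are hypothetical and not part of any Gremlin driver.

```python
def merge_paths(paths):
    """Collapse overlapping paths into unique vertices and edges.

    Each path is assumed to alternate vertex id, edge id, vertex id, ...
    """
    vertices = []   # vertex ids in first-seen order, without duplicates
    edges = {}      # edge id -> (out vertex id, in vertex id)
    seen = set()
    for path in paths:
        # Vertices sit at even positions of the path.
        for i in range(0, len(path), 2):
            if path[i] not in seen:
                seen.add(path[i])
                vertices.append(path[i])
        # Edges sit at odd positions, between their two vertices.
        for i in range(1, len(path), 2):
            edges[path[i]] = (path[i - 1], path[i + 1])
    return vertices, edges


# Hand-written stand-in for the four paths shown above.
paths = [
    ["1", "e0", "2a"],
    ["1", "e1", "2b"],
    ["1", "e0", "2a", "e2", "3a"],
    ["1", "e0", "2a", "e2", "3a", "e3", "4a"],
]
vertices, edges = merge_paths(paths)
```

With the sample graph this leaves five vertices and four edges, which is the whole tree without the repetition that path() produces.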

Replacing path() with tree() will essentially do that deduplication by maintaining the tree structure of the path history:

gremlin> g.V('1').repeat(out()).emit().tree()
==>[v[1]:[v[2b]:[],v[2a]:[v[3a]:[v[4a]:[]]]]]
gremlin> g.V('1').repeat(outE().inV()).emit().tree()
==>[v[1]:[e[0][1-child->2a]:[v[2a]:[e[2][2a-child->3a]:[v[3a]:[e[3][3a-child->4a]:[v[4a]:[]]]]]],e[1][1-child->2b]:[v[2b]:[]]]]

The Tree is just represented as a Map where each key represents a root and each value is another Tree (i.e. the branches from it). It is perhaps better visualized this way:

gremlin> g.V('1').repeat(out()).emit().tree().unfold()
==>v[1]={v[2b]={}, v[2a]={v[3a]={v[4a]={}}}}
gremlin> g.V('1').repeat(out()).emit().tree().unfold().next().value
==>v[2b]={}
==>v[2a]={v[3a]={v[4a]={}}}
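
If a UI needs explicit parent-child objects, the nested Map can be walked recursively on the application side. The following is a minimal Python sketch, assuming the tree() result has been deserialized into plain nested dictionaries keyed by vertex id. How a given driver or CosmosDB actually serializes the result may differ, and the `to_nodes` helper and the `id`/`children` field names are hypothetical.

```python
def to_nodes(tree):
    """Turn a {key: subtree} map into a list of {"id", "children"} nodes."""
    return [
        {"id": key, "children": to_nodes(subtree)}
        for key, subtree in tree.items()
    ]


# Hand-written stand-in for the tree() output of the sample graph.
tree = {"1": {"2b": {}, "2a": {"3a": {"4a": {}}}}}
nodes = to_nodes(tree)
```

An empty Map becomes an empty `children` list, so leaves such as 2b and 4a need no special handling.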

If neither of these structures is suitable and subgraph() is not available, you can technically just capture and return the edges you traverse as the low-level elements of your subgraph, as described in this blog post.

Given the comments on this answer, I also present the following option, which uses group():

gremlin> g.V('1').emit().
......1>   repeat(outE().group('a').by(outV()).by(inV().fold()).inV()).cap('a').unfold()
==>v[1]=[v[2a], v[2b]]
==>v[3a]=[v[4a]]
==>v[2a]=[v[3a]]

It's not exactly a "tree" but if you know the root (in this case v[1]) you can find its key in the Map. The values are the children. You can then look up each of those children as a key in the Map to find whether they have children, and so on. For example, we can look up v[2b] and find that it has no children, while looking up v[2a] reveals a single child of v[3a]. Gremlin can be pretty flexible in getting answers if you can be sorta flexible in how you deal with the results.
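
That lookup procedure can be sketched on the application side as follows. This is a minimal Python sketch, assuming the group() result has been deserialized into a dictionary from parent id to a list of child ids and that the data really is a tree (no cycles). The `build_tree` helper and the `id`/`children` field names are hypothetical.

```python
def build_tree(root, children_by_parent):
    """Recursively look up each vertex's children in the grouped map."""
    return {
        "id": root,
        # A vertex that is not a key in the map (such as 2b) is a leaf.
        "children": [
            build_tree(child, children_by_parent)
            for child in children_by_parent.get(root, [])
        ],
    }


# Hand-written stand-in for the group() output shown above.
children_by_parent = {"1": ["2a", "2b"], "3a": ["4a"], "2a": ["3a"]}
tree = build_tree("1", children_by_parent)
```

Because the recursion starts from a known root, the order of the entries in the Map does not matter.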

Hello Stephen, thank you so much for the post. Before I run it, I wanted to know one thing about traversing the graph: can we shape the response? I once tried project('property1','property2'..).by('property1').property('2'), which gives me flattened JSON as the response. Can I do something similar while traversing, selecting the properties I need from each vertex and returning them as nested objects? { firstVertex:{ properties subVertex : { properties , and so on.. } } } — Rahul Anand, Sep 30 '22 at 11:20
Also Stephen, I wanted to know whether we can structure the response as parent-child key-value pairs in the Gremlin result itself, or whether that is something I need to do in my .NET application. The closest I got was with tree(), because with path() I get multiple vertices at the first level of the JSON object. I might be wrong, please advise. The main requirement is to read the complete graph, parse it, and share it with a UI that will bind it into a tree structure. Is it possible to achieve this? — Rahul Anand, Sep 30 '22 at 11:56
yes, you can apply `project()` to `path()`/`tree()` by adding a `by()` modulator to either like, `path().by(project(...)...)`. Please see the TinkerPop documentation for how `by()` gets applied with those steps. If you want to maintain parent-child relationships in a singular fashion then `tree()` is probably your only option built into Gremlin aside from `subgraph()` which you can't use. — stephen mallette, Sep 30 '22 at 12:33
If you knew the depth of your tree you could `project()` each level of it with a nested query to get the exact structure you want, but there really isn't a way to use `repeat()` to do that dynamically. I updated my answer for one more option you could consider. maybe that will work for you. — stephen mallette, Sep 30 '22 at 12:34
Thank you Stephen, project did the magic. Though it's incurring too high a request charge on Azure, and I don't know how to optimise it. I am still learning; hopefully I will optimise it some day. — Rahul Anand, Oct 16 '22 at 17:29
