
Can someone help me please with this simple query...Many thanks in advance...

I am using the following Gremlin query and it works well, giving me the original vertex ('v', with id = 12345), its edges ('e') and the child vertex (id property). However, if the original vertex 'v' (with id = 12345) has no outgoing edges, the query returns nothing. I still want the properties of the original vertex 'v' even if it has no outgoing edges and no child. How can I do that?

g.V().has('id', '12345').as('v').
  outE().as('e').
  inV().
    as('child_v').
    select('v', 'e', 'child_v').
    by(valueMap()).by(id).by(id)
noam621
nDev

2 Answers


There are a couple of things going on here, but the main change your traversal needs is a project() step in place of the select().

select() and project() steps are similar in that both let you format the results of a traversal, but they differ in (at least) one significant way. A select() step gives you access to previously traversed and labeled elements (via as()). A project() step takes the current traverser and branches from it to shape the output moving forward.

In your original traversal, when the original v has no outgoing edges, all the traversers are filtered out at the outE() step. Since no traversers remain after outE(), the remainder of the traversal has no input stream, so there is no data to return. If you use a project() step after the original v, you can return the original traverser as well as the edges and incident vertices. This does lead to a slight complication when no out edges exist. Gremlin does not handle null values, such as the absence of out edges, so you need to return some constant value for those by() modulators using a coalesce() step.

Here is a functioning version of this traversal:

g.V().hasId(3).
  project('v', 'e', 'child_v').
    by(valueMap()).
    by(coalesce(outE().id(), constant(''))).
    by(coalesce(out().id(), constant(''))) 
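
Note that each by() modulator uses only the first result of its traversal, so the query above reports a single edge and a single child even when the vertex has several. If you want one row per edge while still keeping a vertex that has no edges, one possible shape is optional() combined with path(). The following is only a sketch for the Gremlin Console: the three-vertex TinkerGraph, the 'node' and 'child' labels, and the 'child-1'/'child-2' values are hypothetical sample data, and it is not necessarily the same traversal as the examples linked in the comments.

```groovy
// Gremlin Console sketch. The graph below is hypothetical sample data;
// 'id' is an ordinary string property here, as in the question.
graph = TinkerGraph.open()
g = graph.traversal()
g.addV('node').property('id', '12345').as('parent').
  addV('node').property('id', 'child-1').as('c1').
  addV('node').property('id', 'child-2').as('c2').
  addE('child').from('parent').to('c1').
  addE('child').from('parent').to('c2').iterate()

// Parent with two out edges: one path per edge, each holding
// the parent's valueMap, the edge id and the child vertex id.
g.V().has('id', '12345').
  optional(outE().inV()).
  path().by(valueMap()).by(id).by(id)

// Vertex with no out edges: optional() passes the vertex through unchanged,
// so a single path containing only its valueMap is returned.
g.V().has('id', 'child-1').
  optional(outE().inV()).
  path().by(valueMap()).by(id).by(id)
```

The by() modulators on path() are applied to the path elements in order, which is why the first one formats the vertex and the next two return ids. The trade-off is that the parent's valueMap is repeated once per edge.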
bechbd
  • Excellent explanation of the problem, but the original query had a link between the edge id and the child id. In the query you have given, only one edge and one child will be returned. And even if you add `fold` you get two separate arrays. – noam621 May 18 '20 at 16:39
  • I am not sure what you mean, @noam621, when you say only one edge and one child. Both of your queries, as well as the one above, use outE(). I understood that if there are 5 edges from the parent vertex, these queries will return the parent properties, the edge id, and the vertex id for each of the child vertices (i.e. 5). That is what outE() gives you AFAIK...thanks – nDev May 18 '20 at 20:13
  • @nDev The `by` modulator stops at the first value its traversal returns, so unless you use the `fold` step inside each `by`, only one edge will be returned even if there are 5 – noam621 May 18 '20 at 20:23
  • @nDev See the difference between the cases: https://gremlify.com/a4 – noam621 May 18 '20 at 20:30
  • OK, so as I understand it, the by(coalesce(...)) will stop after the first edge, so it will not traverse the other edges. In that case, which is the correct query to use out of the 3 (your second query does not use fold() either)? I was trying to get the parent vertex (its properties) and all its edges (properties) and children (properties)...if there is no edge between the parent and child, then it would return the parent and perhaps an empty string or something for the edge and child properties (as shown by @bechbd) – nDev May 18 '20 at 20:47
  • OK I think I understand what you are looking for now. You can do this using an optional() step. Take a look here: https://gremlify.com/a7 – bechbd May 18 '20 at 20:52
  • @nDev I think my first query will probably do the best job, since it avoids fetching duplicate data and returns the edge id and the child id in one structure – noam621 May 18 '20 at 20:54
  • @noam621 Your first query gives results but only if there is an edge from the Node. I wanted the parent vertex even when there is no edge between the parent and the child. The first query does not give me anything. The query bechbd gave seems to work well in all cases. – nDev May 19 '20 at 08:35
  • @nDev The query in a7 will work great and is much more elegant than my second query, but it still fetches a lot of duplicate data. The first query should work even if there aren't any child vertices, or am I missing something? https://gremlify.com/aa – noam621 May 19 '20 at 08:39
  • I am not sure what duplicate data you are referring to. I tried it and there seems to be no duplicate data being returned. The query in a7 seems to work even when there is no edge between parent and child, and returns only the parent as I had wanted. This is because it is using "optional". Thanks to @bechbd too! Both of you have given me great queries to work with and further insight into the Gremlin query language. Much appreciated. Can you both please update your queries above to reflect what I said, just for other people's use in the future! – nDev May 19 '20 at 09:29
  • @nDev I meant that the vertex properties will be returned once for each child. If Marko has 5 vertices connected to him, you will get the valueMap of Marko 5 times... – noam621 May 19 '20 at 11:23
  • @noam621 and bechbd, there is something strange about the values returned by all your queries. Let's say we have A, which has two children B and C, and C has D as its one child. When I ask for A I get a list with one entry for each of the two edges. When I query for, say, B or C or D, I get a list containing the parent vertex, or the parent vertex, edge and child vertex id. The output is different. I want a consistent return value where I get a list regardless, and each element in the list is a dictionary containing the parent vertex, edge and child vertex id, just as is the case when I query for vertex A. – nDev May 19 '20 at 15:13
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/214183/discussion-between-noam621-and-ndev). – noam621 May 19 '20 at 15:19

Currently you will get a lot of duplicate data: with the query above you will get the vertex properties once per edge (E times). It will probably be better to use project():

g.V('12345').project('v', 'children').
    by(valueMap()).
    by(outE().as('e').
      inV().as('child').
        select('e', 'child').by(id).fold())

example: https://gremlify.com/a1

You can get the original data format if you do something like this:

g.V('12345').as('v').
  coalesce(
    outE().as('e').
    inV().
      as('child_v').
    select('v', 'e', 'child_v').
    by(valueMap()).by(id).by(id),
    project('v').by(valueMap())
  )

example: https://gremlify.com/a2

noam621