I'm trying to write a Gremlin query that returns the traversed vertices and edges (with their properties), limited to the most complex starting vertices, i.e. those with the highest count of related vertices.
In other words, I want to retrieve the patients with the most codes, but there is no direct relationship between Patients and Codes. This is the relationship and direction: Patient -> Diagnosis <- Code
Here is my attempt:
g.V().hasLabel('Patient').
outE().inV().
inE().outV().
path().
by(elementMap()).
order().
by(count(local), asc).
tail(2).
unfold().
toList()
I wanted this to return the patient vertices with their traversed edges and vertices, limited to the top 2 patients by the number of codes reached per patient. Instead, I got a single patient vertex with its traversed edges and vertices.
Here is a sample insert that replicates the same relationships (it uses the shortened labels 'pat', 'diag', and 'code'):
g
.addV('pat').property(id, 'p-0')
.addV('pat').property(id, 'p-1')
.addV('pat').property(id, 'p-2')
.addV('diag').property(id, 'd-0')
.addV('diag').property(id, 'd-1')
.addV('diag').property(id, 'd-2')
.addV('code').property(id, 'c-0')
.addV('code').property(id, 'c-1')
.addV('code').property(id, 'c-2')
.V('p-0').addE('contracted').to(V('d-0'))
.V('p-0').addE('contracted').to(V('d-1'))
.V('p-0').addE('contracted').to(V('d-2'))
.V('p-1').addE('contracted').to(V('d-1'))
.V('p-2').addE('contracted').to(V('d-2'))
.V('c-0').addE('includes').to(V('d-0'))
.V('c-1').addE('includes').to(V('d-0'))
.V('c-1').addE('includes').to(V('d-1'))
.V('c-2').addE('includes').to(V('d-1'))
The format I would like to return is the one I get by appending ".path().by(elementMap()).unfold().toList()" after the vertex and edge steps.
I want the output to be the vertices and edges needed to draw the resulting graph.
Out of three patients, I want to return the top 2 most complex patients (based on the number of codes their diagnoses have). I don't want to return the patient with just one code.
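To pin down the ranking I expect, here is a rough sketch in plain Python (not Gremlin) over the edge list from the sample insert. It assumes that "number of codes" means distinct codes reachable through a patient's diagnoses, and that every vertex named in an edge exists. The names CONTRACTED, INCLUDES, and top_patients_by_codes are made up for illustration.

```python
from collections import defaultdict

# Edge lists copied from the sample insert (assumption: every vertex
# named in an edge exists in the graph).
CONTRACTED = [("p-0", "d-0"), ("p-0", "d-1"), ("p-0", "d-2"),
              ("p-1", "d-1"), ("p-2", "d-2")]          # patient -> diagnosis
INCLUDES = [("c-0", "d-0"), ("c-1", "d-0"),
            ("c-1", "d-1"), ("c-2", "d-1")]            # code -> diagnosis


def top_patients_by_codes(contracted, includes, limit=2):
    """Rank patients by distinct codes reachable via Patient->Diagnosis<-Code."""
    # Index the codes attached to each diagnosis.
    codes_by_diag = defaultdict(set)
    for code, diag in includes:
        codes_by_diag[diag].add(code)

    # Collect the distinct codes each patient reaches through its diagnoses.
    codes_by_patient = defaultdict(set)
    for patient, diag in contracted:
        codes_by_patient[patient] |= codes_by_diag.get(diag, set())

    # Highest code count first; ties are broken by patient id.
    ranked = sorted(codes_by_patient.items(),
                    key=lambda item: (-len(item[1]), item[0]))
    return [(patient, len(codes)) for patient, codes in ranked[:limit]]


print(top_patients_by_codes(CONTRACTED, INCLUDES))
```

The Gremlin query I'm after should select the same patients and then return their full paths (patient, edges, diagnoses, codes) rather than only the counts.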