3

I want to Display a tree-like subgraph showing all paths starting from a single node. Most connections are bidirectional(2 edges at the moment probably change to 1 edge instead for better performance) and I have different types of relationships. But theres one special sequence of nodes and relations leading into even more loops and this sequence shouldn't be displayed in the resulting graph.

MATCH p=(n)-[r*0..4]-(m) WHERE n.CAT='a1' RETURN p

This Query is nearly solving the problem, but I couldnt find/create the correct query to leave the sequence out of the graph. The depth is still limited because with depth [r*0..9] neo4j web app stopped responding (not the main problem! I already know how to improve/fix this) and it took a while to stop the database. The Sequence is creating the most unwanted permutations and tons of possible paths and loops. There is no way to change the relations(except using 1 edge instead of 2, practically its bidirectional but as far as i know the operation should still be the same and less complex, correct me know if I'm wrong) between the nodes because with another starting node the relation is potentially needed.

Sequence:

(x)-[:relation2]-(y)-[:relation2]-(z)

and x,y,z all having the same propertie value at CAT='variable'. It's always the same relation and the same propertie. The values and the nodes vary.

All nodes are potential starting nodes.

Also the query has to be very very fast dealing with long paths(different lenghts) splitting sometimes. Most splits will be ignored with the sequence exlcuded. The depth has to be unlimited. The resulting paths will always end as soon the query ignores the sequence and doesnt "go" this path.

To prevent misunderstanding:

x,y,z1 and z3 are sharing the same property CAT(category)='a'and z2.CAT='b' for example.

display dont display and stop

(x)-[:relation2]-(y)-[:relation2]-(z1)

(x)-[:relation2]-(y)-[:relation2]-(z2)

(x)-[:relation2]-(y)-[:relation1]-(z3)

x.CAT = y.CAT = z1.CAT = z3.CAT != z2.CAT

Performance of the query is very important and the reason for me doing this, but first I need "simple" solution to progress on this project.

Thanks a lot in advance.

Yoshi
  • 141
  • 2
  • 16
  • So let me see if I got this right, Given 3 parameters (Start node, CAT value and Relation type) You want the start node, and all children(recursive) that have CAT value, and only considering Relation type edges? (In the display part, shouldn't z1 show and z2 not be shown?) – Tezra Jun 22 '17 at 17:56
  • The only problem is Relationtype 2 coming twice in a row and the nodes connected with them sharing the same Category also the algorithm has to stop going the path after noticing the special case because after this the possible paths explode! With ignoring the sequence as a paths i have around 20 paths as result. With the special case accepted and considered as path i have over 1360 paths as result. The query has to be fast and considering the special case as path is no option. – Yoshi Jun 22 '17 at 19:29
  • Wait, is the special case that each -[r]-(n) in the path have to have a new relationtype or CAT? In your example, `(x)-[:relation2]-(y)-[:relation2]-(z1)` is the only one that is uniform CAT and Relation, but that is the only one that stops at y. Is that correct or do you want the reverse of that? – Tezra Jun 22 '17 at 19:38
  • Does each ` -[r]-(n)` need to be new, or just different from the one before it? – Tezra Jun 22 '17 at 19:40
  • everything is allowed except this one case where nodes have 3 times same category and the nodes connected with `relation2`. One node interupting the sequence with a different CAT is fine. The nodes after are allowed to have the same CAT again as long as it is not leading into the mentioned case where the one node is followed by 2 other with same CAT value connected with `relation2`. Other relationships are allowed to have 3 times in a row nodes with same CAT. – Yoshi Jun 22 '17 at 21:10
  • Its done with `if((x)-[:relation2]-(y)-[:relation2]-(z) && x.CAT==y.CAT==z.CAT)` all other combinations/paths are totally fine. But I dont know a query to filter this case and prevent neo4j to go these paths(stop as soon as detected). Is it possible without extracting them after the work is done? – Yoshi Jun 22 '17 at 21:19
  • Raw Cypher can't do this kind of logic. For complex logic like this, you will need to either use APOC or the Traversal API. Are either of those an option for you? – Tezra Jun 23 '17 at 00:58
  • Do they perform well? I could also write my own traversal algorithm in java. What do you think is the best option for good performance? And im testing all of this with Community Edition. – Yoshi Jun 23 '17 at 06:56
  • [APOC](https://github.com/neo4j-contrib/neo4j-apoc-procedures) is an installable feature that adds functions you can use in Cypher. However, Cypher is built on top of the Traversal API. So anything Cypher can do, the Traversal API can do just as well. In general, the T-API will usually be faster than Cypher as you can write a Java algorithm to do any complex logic you want. It's more work, but it gives you much more control. Here is a link to the [docs](https://neo4j.com/docs/java-reference/current/#tutorial-traversal). – Tezra Jun 23 '17 at 12:20
  • Thanks a lot!! I will take a look at the Traversal API first. – Yoshi Jun 26 '17 at 06:08
  • The T-API is the key for sure. Thanks again! – Yoshi Jun 26 '17 at 07:08
  • I tried to make a custom Evaluator. But somehow it still doesnt work for me. Do you have any expierience with `evaluate(Path path)` ? – Yoshi Jun 26 '17 at 14:26
  • 1
    I have note used the T-API yet, So all I can recommend is to parse through the [docs](http://neo4j.com/docs/java-reference/current/javadocs/org/neo4j/graphdb/traversal/Evaluator.html) and look for online [tutorials/examples](https://maxdemarzi.com/2015/09/04/flight-search-with-the-neo4j-traversal-api/). You should probably post that as a new question. – Tezra Jun 26 '17 at 17:46
  • I already read them before. I finally have the right paths. Now i just need to figure out how i get them back in to neo4j. I will try to generate a huge query out of the paths. You helped me a lot with your advice to you use the T-API. I was convinced it is possible in cypher. – Yoshi Jun 28 '17 at 06:34

2 Answers2

1

For this you should create a custom traversal. There is an API for that in Neo4j, take a look at the documentation : https://neo4j.com/docs/java-reference/current/#tutorial-traversal

To not traverse twice a node, you should use the NODE_PATH as uniqueness rule (in cypher, it's a RELATIONSHIP_PATH uniqueness to avoid graph cycle).

Cheers

logisima
  • 7,340
  • 1
  • 18
  • 31
  • ye i did this as you can see in the comments and the problem im facing now is [here](https://stackoverflow.com/questions/44796982/neo4j-display-subgraph-based-on-multiple-paths). Summary: How do you get the results of the traverser back to a graph? – Yoshi Jun 28 '17 at 11:49
  • A graph is just composed of an array of nodes and an array of relationships. Like I said in the comment on the other post, the `apoc.path.subgraphAll` give you those arrays. You can take a look at the code https://github.com/neo4j-contrib/neo4j-apoc-procedures/blob/3.1/src/main/java/apoc/path/PathExplorer.java#L72-L75 – logisima Jun 28 '17 at 12:02
  • But how to filter the sequence? I can see there is a relationshipfilter but how to filter the sequence? Its a combination out of relationships and node properties. I dont know how you can achieve this with existing apoc functions. – Yoshi Jun 28 '17 at 12:16
0

For this you should create a custom traversal. There is an API for that in Neo4j, take a look at the documentation : https://neo4j.com/docs/java-reference/current/#tutorial-traversal

To not traverse twice a node, you should use the NODE_PATH as uniqueness rule (in cypher, it's a RELATIONSHIP_PATH uniqueness to avoid graph cycle).

As @Tezra and @logisima pointed out the Traversal API is the key. But its not done with uniqueness or anything this complex. This finally printed out the results i was looking for:

TraversalDescription td = graphDb.traversalDescription()
.depthFirst()
.uniqueness(Uniqueness.RELATIONSHIP_GLOBAL)
.evaluator( new Evaluator()
{
    @Override
    public Evaluation evaluate( final Path path )
    {
        Node parent=null,grandParent =null;
        Relationship rel1=null,rel2=null;
        int nCount=0,rCount=0;
        
        if(path.length()>=2)
        {
            for(Node node : path.reverseNodes())
            {
                if(nCount==1)
                    parent = node;
                if(nCount==2)
                    grandParent=node;
                if(nCount>2)
                    break;
                nCount++;
            }
            for(Relationship rel : path.reverseRelationships())
            {
                if(rCount==0)
                    rel1 = rel;
                if(rCount==1)
                    rel2=rel;
                if(rCount>1)
                    break;
                rCount++;
            }
        }
        if(path.length()<2)
        {
            return Evaluation.INCLUDE_AND_CONTINUE;
        }
        else if (path.length()>=2 
                && rel1.isType(relType)
                && rel2.isType(relType)
                && path.endNode().getProperty("CATEGORY").toString().equals(parent.getProperty("CATEGORY").toString())
                && path.endNode().getProperty("CATEGORY").toString().equals(grandParent.getProperty("CATEGORY").toString()))
        {
            return Evaluation.EXCLUDE_AND_PRUNE;
        }
        else
        {
            return Evaluation.INCLUDE_AND_CONTINUE;
        }
    }
});
try ( Transaction tx = graphDb.beginTx() )
{
    Traverser traverser = td.traverse(graphDb.getNodeById(id));
    for ( Path path : traverser)
    {
        System.out.println(path.toString());
                }
    tx.success();
}

(special thanks to @Tezra bringing me on the right track on time)

A solution with apoc should also be possible and also avoid the problem of getting the graph back into neo4j.

Community
  • 1
  • 1
Yoshi
  • 141
  • 2
  • 16