
I am trying to use Neo4j with neomodel to represent some graph relationships. However, I run into performance issues when I try to construct a graph with millions of nodes and relationships.

With a graph of 10k nodes and 30k relationships, it takes 4 min 20 s to import it into Neo4j: 1 min 40 s to create the nodes and 2 min 40 s to create the relationships by calling foo.connect(bar). That is extremely slow.

When I use the batch API provided by neomodel, I can create all the nodes in just 4 s, but it does not reduce the time needed to create the relationships.

neomodel uses Cypher queries to create relationships one by one. So I decided to write my own queries, in which I first match all the nodes needed for 100 relationships and then create those relationships. Once or twice this finished in a few seconds; in other cases it again took minutes. When I watch htop to see what is going on, I can see that 2 cores are fully utilized by the Neo4j database.

I have found the following article: Import 10M Stack Overflow Questions into Neo4j In Just 3 Minutes, which uses neo4j-import, but I would like to avoid that tool.

I am using the default configuration, except that I set dbms.jvm.additional=-Xss256M to be able to execute those batched relationship queries. I have a unique index on the property that I use for node lookup. Before each experiment I delete all nodes and relationships.

Do you have any idea how to speed it up?

OneCricketeer
Martin Majlis
  • Hi @Martin, any luck on this? I am facing the same issue with neomodel. I am able to create the nodes with properties in bulk but not the relationships – Aashutosh Soni Jan 31 '22 at 10:58

1 Answer


How many relationships do your nodes have?

In general, I don't think object mappers are a good fit for mass insertions.

Please check out: https://medium.com/@mesirii/5-tips-tricks-for-fast-batched-updates-of-graph-structures-with-neo4j-and-cypher-73c7f693c8cc
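As a rough illustration of the UNWIND batching idea from the linked article, here is a minimal Python sketch. It sends many relationships per query as a single list parameter instead of one query per relationship. The label `Item`, the key property `uid`, the relationship type `LINKS_TO`, and the helper names are all hypothetical, and the `$batch` parameter syntax assumes a reasonably recent Neo4j (older versions used `{batch}`). The `run_query` argument is any callable that accepts a query string and a parameter dict.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical label, key property, and relationship type; adjust to your model.
# The lookup property should be covered by an index or unique constraint.
QUERY = """
UNWIND $batch AS row
MATCH (a:Item {uid: row.src})
MATCH (b:Item {uid: row.dst})
CREATE (a)-[:LINKS_TO]->(b)
"""


def chunked(pairs: Iterable[Tuple[object, object]],
            size: int) -> Iterator[List[Dict[str, object]]]:
    """Yield lists of at most `size` rows shaped like {"src": ..., "dst": ...}."""
    batch: List[Dict[str, object]] = []
    for src, dst in pairs:
        batch.append({"src": src, "dst": dst})
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def create_relationships(run_query: Callable[[str, dict], object],
                         pairs: Iterable[Tuple[object, object]],
                         batch_size: int = 10_000) -> int:
    """Send one parameterized query per batch; return the number of rows sent."""
    sent = 0
    for batch in chunked(pairs, batch_size):
        run_query(QUERY, {"batch": batch})
        sent += len(batch)
    return sent
```

With neomodel, one option might be to pass `db.cypher_query` (from `from neomodel import db`) as `run_query`, for example `create_relationships(db.cypher_query, pairs)`. Because the query text stays constant and only the parameter changes, the database can reuse the query plan across batches.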

Can you enable query logging for queries taking longer than 1 second and share the queries that neomodel generates?

dbms.jvm.additional=-Xss256M is excessive. It means every thread gets a 256M stack; usually 2M is enough.

Michael Hunger
  • If I use his trick of unwind, how can I handle deduplicating entities? Neomodel provides a nice `unique=True` parameter which I can pass to multiple properties of a node class. I can't seem to do that with `MERGE {batch}` after an unwind, since the distinguishing properties are not defined as unique – yampelo Mar 04 '18 at 17:25
  • Hi @Michael, any luck on this? I am facing the same issue with neomodel. I am able to create the nodes with properties in bulk but not the relationships – Aashutosh Soni Jan 31 '22 at 10:59
  • At this point in time, bulk creation of relationships doesn't seem to be possible with neomodel. It is mentioned on their repo with this issue: [Feature: add support to pass relationship properties to get_or_create and create_or_update batch operations](https://github.com/neo4j-contrib/neomodel/issues/583). You can always use cypher queries to achieve it though. They have a guide on how to do this: [neomodel cypher queries](https://neomodel.readthedocs.io/en/latest/cypher.html). – Ethan Posner Jun 17 '22 at 02:02