I have CSV files that contain thousands of rows each; the file sizes range from 500 MB to 3.1 GB. I first did a bulk import, and it took only a few minutes to load all the data into the graph DB. Now, for my project, I need to upload data on a regular basis, so I have written a Python script using the Neo4j Bolt driver that performs all the regular node creates, updates, and deletes. Creating relationships from files also works for a small amount of data (my prototype).

The problem occurs when I create relationships from large files. Parallelism works, but it gets very slow; my 32-core CPU is fully used (I checked with htop). With a batch size of 100-1000 the cores are used properly; with a batch size of 10000-100000, parallelism does not work. Here is my query for creating the relationships with LOAD CSV:
"""CALL apoc.periodic.iterate('
load csv with headers from "file:///x.csv" AS row return row
','
MERGE (p1:A {ID: row.A})
MERGE (p2:B {ID: row.B})
WITH p1, p2, row
CALL apoc.create.relationship(p1, row.RELATIONSHIP, {}, p2) YIELD rel return rel
',{batchSize:10000, iterateList:true, parallel:true})"""
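For reference, my Python script runs this query through the Bolt driver roughly like the minimal sketch below (the URI and credentials here are placeholders, not my real ones):

from neo4j import GraphDatabase

# Placeholder connection details, not my real endpoint/credentials
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def run_relationship_load(query):
    # apoc.periodic.iterate does all the batching and parallelism
    # server-side, so one session.run() per file is enough;
    # consume() blocks until the whole CALL has finished
    with driver.session() as session:
        return session.run(query).consume()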
It works totally fine for a small amount of data, but it gets very slow with large files: creating 10 relationships took roughly 39 seconds. Is the MERGE operation inefficient in my case, or am I missing some trick here? Kindly help me solve this. I am working on an EC2 instance with 240 GB of RAM. I have tried apoc.warmup.run; it warmed up about 192 GB, but no significant change was observed.
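The warm-up call was something along these lines (I am not sure of the exact flags I passed, so treat the booleans as an assumption):

CALL apoc.warmup.run(true, true, true)  // loadProperties, loadDynamicProperties, loadIndexes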