I want to load a csv that contains relationships between Wikipedia categories rels.csv (4 million of relations between categories). I tried to modify the setting file by changing the following parameter values:
dbms.memory.heap.initial_size=8G
dbms.memory.heap.max_size=8G
dbms.memory.pagecache.size=9G
My query is as follows:
USING PERIODIC COMMIT 10000
LOAD CSV FROM
"https://github.com/jbarrasa/datasets/blob/master/wikipedia/data/rels.csv?raw=true" AS row
MATCH (from:Category { catId: row[0]})
MATCH (to:Category { catId: row[1]})
CREATE (from)-[:SUBCAT_OF]->(to)
Moreover, I created indexes on catId and catName. Despite all these optimizations, the query still running (since yesterday).
Can you tell me if there are more optimization that should be done to load this CSV file?