2

This is regarding neo4j csv import using LOAD csv. Suppose my csv file format is as following.

Id, OID, name, address, Parents , Children
1, mid1, ratta, hello@aa, ["mid250","mid251","mid253"], ["mid60","mid65"]
2, mid2, butta, ado@bb, ["mid350","mid365","mid320", "mid450","mid700"], ["mid20","mid25","mid30"]
3, mid3, natta, hkk@aa, ["mid50","mid311","mid543"], []

So the parents and children columns consists of mids basically..while importing csv into neo4j using LOAD CSV.. I want to create following nodes and relationships.

  1. NODES for each rows (for each id column in csv)

  2. [:PARENT] relationship by matching the OID property in each row and OID properties inside parents column. So as a example when processing the first row...there should be four nodes (mid1, mid250,mid 251 and mid 253) and 3 PARENT relationship between mid1 and other 3 nodes.

  3. [: CHILD ] relationship by matching the OID property in each row and OID properties inside children column.

Please help!!

Tried doing it with for each function but the results didn't come correctly. Im doing it through a python script. just need to edit the cypher query.

def create_AAA(tx):
    tx.run(
        "LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (e:AAA {id: row._id,OID: row.OID,address: row.address,name: row.name})"
    )

def create_parent(tx):
    tx.run(
        "LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (a:AAA {OID: row.OID}) FOREACH (t in row.parents | MERGE (e:AAA {OID:t}) MERGE (a)-[:PARENT]->(e) )"
    )

def create_child(tx):
    tx.run(
        "LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row MERGE (a:AAA {OID: row.OID}) FOREACH (t in row.children | MERGE (e:AAA {OID:t}) MERGE (a)-[:CHILD]->(e) )"
    )

with driver.session() as session:
    session.write_transaction(create_AAA)
    session.write_transaction(create_parent)
    session.write_transaction(create_child)
Nirmana
  • 33
  • 3
  • Can you show us that have you tried? – Tomaž Bratanič Nov 29 '22 at 12:49
  • Yes sure. I've edited the code. can you check. The problem here is when creating nodes the OID property returns as mid1 ....whereas when creating relationships OID property comes like this -> ['mid250']. So when creating relationships it creates another duplicate node. – Nirmana Nov 29 '22 at 15:10
  • When you read the parents and children data from csv, it will not be converted to a list unless you use a function to conversion from strng to list. – jose_bacoy Nov 30 '22 at 01:46

2 Answers2

1

Please follow the instructions below:

  1. Change the column names of Parents and Children into parents and children since neo4j is case sensitive.
  2. Remove the spaces in your csv file so that you don't need to do trim () on each columns in the csv.
  3. In your parents and children columns, remove the commas on the string list because it is causing an error. OR use another delimiter and not comma. In my example, I used space as delimiter.
  4. Below script will remove the quotes and [] characters then convert the string list into a list (using split() function)
  5. Do the same for create child function.
LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row 
MERGE (a:AAA {OID: row.OID}) 
FOREACH (t in split(replace(replace(replace(row.parents,'[', ''),']', ''),'"', ''), ' ') | 
           MERGE (e:AAA {OID:t}) MERGE (a)-[:PARENT]->(e) )

See sample csv here:

Id,OID,name,address,parents,children
1,mid1,ratta,hello@aa,["mid250" "mid251" "mid253"],["mid60" "mid65"]
2,mid2,butta,ado@bb,["mid350" "mid365" "mid320" "mid450" "mid700"],["mid20" "mid25" "mid30"]
3,mid3,natta,hkk@aa,["mid50" "mid311" "mid543"],[]

See sample result here: enter image description here

jose_bacoy
  • 12,227
  • 1
  • 20
  • 38
  • Thank you very much. It works well when implementing in neo4j. But when I use that cypher query inside the above given python function (create_parent) it doesnt work. it shows an syntax error. SyntaxError: unterminated string literal (detected at line 39) – Nirmana Nov 30 '22 at 05:55
  • There is an issue..suppose in some rows the parent column is empty, then it still creates a node with OID property blank – Nirmana Nov 30 '22 at 10:20
  • Before doing FOREACH add this line, WITH row where row.children <> "[]" – jose_bacoy Nov 30 '22 at 11:25
  • It didnt work..you mean like this right..It creates some additional nodes. – Nirmana Nov 30 '22 at 11:55
0
LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row
WITH row WHERE row.children <>"[]"  
MERGE (a:AAA {OID: row.OID}) 
FOREACH (t in split(replace(replace(replace(row.children,'[', ''),']', ''),'"', ''), ' ') | 
MERGE (e:AAA {OID:t}) MERGE (a)-[:CHILD]->(e) )

Now it works fine.

enter image description here

Nirmana
  • 33
  • 3
  • open your CSV file and see what value you put on children column when it is empty. When I tried the updated query, it works for me. LOAD CSV WITH HEADERS FROM 'file:///aaa.csv' AS row WITH row where row.children <> "[]" RETURN count(row) as row_count This will return 2 rows only instead of three. – jose_bacoy Nov 30 '22 at 13:28
  • Ya.. I've tried for the same example CSV of yours.. I'll try again then and will ask if there is any prob..Thank you. – Nirmana Nov 30 '22 at 13:41
  • 1
    Sorted it...my mistake..have to put WITH row where row.children <> "[]" as the second line of thew code..not the third. Thank you very much. I'll edit the previous results. – Nirmana Nov 30 '22 at 15:05
  • Glad that it worked for you! Enjoy your day. – jose_bacoy Nov 30 '22 at 15:06
  • As it’s currently written, your answer is unclear. Please [edit] to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers [in the help center](/help/how-to-answer). – Community Dec 06 '22 at 01:29