
I'm using Neo4j 2.2.0 and importing data (in the form of a nodes file and relationships file) via LOAD CSV.

The nodes will all be imported under the "Person" label; however, I want to add the "Geotag" label to those whose latitude and longitude fields in the nodes file are not empty.

So, for example, the below nodes file (ignore the extra line in between rows)

"username","id","latitude","longitude"

"abc123","111111111","33.223","33.223"

"abc456","222222222","",""

I would like to create node "abc123" with the Person and Geotag labels, and node "abc456" with just the Person label because it doesn't have a latitude and longitude.

I thought this would be something along the lines of:

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line 
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
SET p: (CASE WHEN line.latitude IS NOT NULL THEN GEOTAGGED);

I know I am using the CASE statement incorrectly as well as the SET statement, but is this possible to do while importing the nodes? This file has over 3 million nodes in it and it would be helpful to do it upon insertion so that when new nodes get added (usually in batches), we're not exploring all nodes just to get to the new ones.

I've explored other SO questions (How to set relationship type and label in LOAD CSV?, Loading relationships from CSV data into neo4j db, Neo4j Cypher - creating nodes and setting labels with LOAD CSV); however, they differ from my question in that those OPs are trying to use a field in the file as the label, whereas I am simply trying to make a conditional decision on which labels to use based on data in the file.

Thanks!

EDIT: In response to an answer, I am trying the following:

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
CASE WHEN line.latitude IS NOT NULL THEN [1] ELSE [] END AS geotagged 
FOREACH (x IN geotagged | SET p:Geotag); 

I get the following error:

QueryExecutionKernelException: Invalid input 'A': expected 'r/R' (line 3, column 2 (offset: 454)) "CASE WHEN line.latitude IS NOT NULL THEN [1] ELSE [] END AS geotagged"

With the caret under the 'A' in "CASE".

EDIT2:

Below is the complete solution, inspired by and only slightly different from David's solution.

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
WITH p, CASE WHEN line.latitude <> "" THEN [1] ELSE [] END AS geotagged 
FOREACH (x IN geotagged | SET p:Geotag); 
Brooks

1 Answer


You are close. You cannot put conditional logic in the SET label statement. Instead, build a collection that holds one element when the latitude/longitude value is not null and is empty otherwise, then iterate over that collection with FOREACH and set the label inside the loop.

...
case when line.latitude IS NOT NULL then [1] else [] end as geotagged
foreach(x in geotagged | set p:Geotag)
...
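As a fuller illustration, the fragment above can be placed in a complete statement roughly as follows. This is only a sketch under a few assumptions: the file path and property names are taken from the question, the CASE expression is introduced with `WITH p,` (a CASE expression cannot start a clause on its own), and the check compares against the empty string for both fields, on the assumption that empty CSV fields may arrive as "" rather than null.

```cypher
LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude })
// Carry the node forward together with a one-element or empty collection.
// A null field also falls through to ELSE, because null <> "" is not true.
WITH p, CASE
  WHEN line.latitude <> "" AND line.longitude <> "" THEN [1]
  ELSE []
END AS geotagged
// The loop body runs once for [1] and not at all for [], so the label is set conditionally.
FOREACH (x IN geotagged | SET p:Geotag);
```

With the sample rows from the question, "abc123" should end up with both the Person and Geotag labels, and "abc456" with Person only.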
Dave Bennett
  • Bennet, thanks. Perhaps I am misunderstanding, I am still getting an error, I added it to my original post. – Brooks Apr 02 '15 at 18:59
  • sorry...I also assumed that when you wrote "case when CASE WHEN", it was a typo, but I tried it both ways and still no luck. – Brooks Apr 02 '15 at 19:15
  • To be clear, the above syntax as-it-is, does not work, however got me on the right track. To correct it, simply put "WITH p, " in front of "CASE WHEN". Exact and complete solution in the OP. – Brooks Apr 02 '15 at 20:00
  • thanks - removed the superfluous "case when". I started typing and then cut and paste from your query - my bad. – Dave Bennett Apr 02 '15 at 21:40