I'm trying to assign labels to nodes as they are created via a LOAD CSV statement, so that they are characterized appropriately (see an earlier question I had during this process: SET label based on data within LOAD CSV). I've been able to set a label at creation time, but now I want to apply one of several labels, each with its own set of conditions, and all mutually exclusive.

So, if I have the following in my CSV file:

"username","id","statusLat","statusLon","placeLat","placeLon"
"abc123","111111111","33.223","33.223","",""
"abc456","222222222","","","33.223","33.223"

I would want user abc123 to be labeled "geo" and user abc456 to be labeled "placegeo". This seems like something Neo4j/Cypher should handle quite easily; the difficulty is that I want all of the decision-making to happen in a single statement.

I've been trying the following code:

LOAD CSV WITH HEADERS FROM "file:d:/Users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, statusLat: line.statusLat, statusLon: line.statusLon, placeLat: line.placeLat, placeLon: line.placeLon })
WITH p, CASE WHEN (line.statusLat <> "" AND line.statusLat <> "0.0") THEN [1] ELSE [] END AS geotagged
FOREACH (a IN geotagged | SET p:Geo)
WITH p, CASE WHEN (COUNT(LABELS(p)) = 1 AND line.placeLat <> "" AND line.placeLat <> "0.0") THEN [1] ELSE [] END as place
FOREACH (b IN place | SET p:placegeo);

I have also tried:

LOAD CSV WITH HEADERS FROM "file:d:/Users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, statusLat: line.statusLat, statusLon: line.statusLon, placeLat: line.placeLat, placeLon: line.placeLon })
WITH p, CASE WHEN (line.statusLat <> "" AND line.statusLat <> "0.0") THEN [1] ELSE [] END AS geotagged
FOREACH (a IN geotagged | SET p:Geo)
WITH p, COUNT(LABELS(p)) AS lb WHERE lb = 1,
CASE WHEN (line.placeLat <> "" AND line.placeLat <> "0.0") THEN [1] ELSE [] END as place
FOREACH (b IN place | SET p:placegeo);

(Among other variations.) It seems I can't chain WITH/CASE clauses together.

Essentially, I want to evaluate the first set of conditions and, if the row meets them, add the label, STOP, and move on to the next row in the CSV file. Otherwise, move on to the next set of conditions and repeat. If there were a "STOP" or "EXIT" command that I could issue within the SET, that would be fine.

I can't nest the FOREACH because of the mutual exclusivity (i.e. if it's one label, it's not the other).

Is there any way to do this?

Brooks

1 Answer

You were pretty close. This should work for you:

LOAD CSV WITH HEADERS FROM "file:d:/Users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, statusLat: line.statusLat, statusLon: line.statusLon, placeLat: line.placeLat, placeLon: line.placeLon })
WITH p,
    CASE WHEN (line.statusLat <> "" AND line.statusLat <> "0.0") THEN [1] ELSE [] END AS geotagged,
    CASE WHEN ((line.statusLat = "" OR line.statusLat = "0.0") AND line.placeLat <> "" AND line.placeLat <> "0.0") THEN [1] ELSE [] END as place
FOREACH (a IN geotagged | SET p:Geo)
FOREACH (b IN place | SET p:placegeo);
cybersam
  • Will that exit out of the CASE block once it matches those specific conditions? I have nodes that will evaluate to true for BOTH sets of conditions (i.e. the statusLat AND placeLat conditions); however, I would only want it to attempt the first CASE block and, if it's true, exit out. Is that how this will work? – Brooks Apr 03 '15 at 16:37
  • After I saw your first answer and responded, I had a feeling that would be the solution. Is that the ONLY way to do this? That means each label I implement will need to check the unique conditional clauses of EACH case before it. Is there no other way to do this? – Brooks Apr 03 '15 at 16:50
  • It's the only answer I have come up with. – cybersam Apr 03 '15 at 16:52
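
On the point raised in the comments, that each label has to repeat the negated conditions of every label before it: one possible way around that, offered here as an untested sketch rather than as part of the answer above, relies on a multi-branch CASE returning the result of the first WHEN that matches. The sketch computes a single label name per row and then runs one FOREACH per label. The alias labelName and the placeholder value "none" are assumptions introduced for illustration; the file path, property names, and labels are taken from the question and answer.

```cypher
LOAD CSV WITH HEADERS FROM "file:d:/Users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, statusLat: line.statusLat, statusLon: line.statusLon, placeLat: line.placeLat, placeLon: line.placeLon })
// Branches are tested top to bottom and only the first match is used,
// so a row that satisfies both conditions still resolves to "Geo".
WITH p,
    CASE
        WHEN line.statusLat <> "" AND line.statusLat <> "0.0" THEN "Geo"
        WHEN line.placeLat <> "" AND line.placeLat <> "0.0" THEN "placegeo"
        ELSE "none" // hypothetical placeholder: no label is applied
    END AS labelName
// Labels cannot be set from a variable in plain Cypher, so each label
// still needs its own FOREACH over a one-element or empty list.
FOREACH (a IN CASE WHEN labelName = "Geo" THEN [1] ELSE [] END | SET p:Geo)
FOREACH (b IN CASE WHEN labelName = "placegeo" THEN [1] ELSE [] END | SET p:placegeo);
```

With this shape, adding a third label would mean one more WHEN branch and one more FOREACH, and the order of the WHEN branches decides which label wins when several conditions hold.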