I'm running the latest Neo4j v2 with the spatial plugin installed. I have managed to add almost all of the nodes I need to the geo index. One problem I'm struggling with: how can I easily check whether a node has already been indexed?

I can't find a REST endpoint that exposes this information, and it is not easy to get at with Cypher. I tried the query below, which seems to give the result I want, except that the runtime is unacceptable.

MATCH (a)-[:RTREE_REFERENCE]->(b) where b.id=989898 return b;

The geo index only stores a reference to the indexed node, as the value of the id property on a node reached through an RTREE_REFERENCE relationship, so I figured this could be the way to go.

This query currently takes 14459 ms when run from the neo4j-shell.

My database is not big: about 41000 nodes in total that I want to add to the spatial index.

There must be a better way to do this. Any ideas or pointers would be greatly appreciated.

pixeltom

2 Answers

Since you know the ID of your data node, you can access it directly in Cypher without an index, and just check for the incoming RTREE_REFERENCE relationship:

START n=node(989898) MATCH (p)-[r:RTREE_REFERENCE]->(n) RETURN r;

As a side note, your Cypher used 'WHERE b.id=989898'. If 989898 is an internal node ID, that will not work, since b.id looks for a property with the key 'id'. For the internal node ID, use 'id(b)'.

If your 'id' is actually a node property (and not its internal ID), then I think @deemeetree's suggestion of using an index over this property is better.
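
To make the distinction concrete, here is a minimal sketch of the two forms, reusing 989898 from the question. Which one applies depends on whether 989898 is a stored property or Neo4j's internal ID, which I can't tell from the question alone.

```cypher
// Matches nodes whose stored *property* named "id" equals 989898.
MATCH (b) WHERE b.id = 989898 RETURN b;

// Matches the node whose *internal* Neo4j ID is 989898.
MATCH (b) WHERE id(b) = 989898 RETURN b;
```

Without a label and index, the first form has to inspect every node, whereas the second form identifies a single node by its internal ID.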

Craig Taverner
  • This still won't work, as the node that is indexed in the neo4j-spatial index doesn't have a direct connection. So in this case node(989898) does not have a direct RTREE_REFERENCE relationship. When a node is added to the spatial index, a new node is created with an "id" property that points back to the node I want indexed. It seems to me that there is no good way to do this check with Cypher today. – pixeltom Feb 24 '14 at 15:44

Right now your query seems to be scouring all the nodes in the graph that are connected by :RTREE_REFERENCE and checking the id property of each one.

Why don't you instead start your search from the node id you need and then get all the paths from there?

I also don't quite understand why you need to return the node that you're defining, but anyway.

As you're running Neo4j 2, I recommend adding labels to your nodes (all of them in the example below):

START n=node(*) SET n:YOUR_LABEL_NAME

Then create an index on the id property of the labeled nodes:

CREATE INDEX ON :YOUR_LABEL_NAME(id)

Once you've done that, run a query like this:

MATCH (b:YOUR_LABEL_NAME {id: 989898}), (a)-[:RTREE_REFERENCE]->(b) RETURN a, b;

That should increase the speed of your query.
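
If labeling every node is more than you want, the same steps can be limited to the nodes at the receiving end of RTREE_REFERENCE. This is only a sketch under assumptions: SpatialRef is a made-up label name, and I'm assuming those nodes carry a numeric id property as described in the question. Nodes that the plugin creates afterwards would presumably need the labeling step run again.

```cypher
// Assumption: "SpatialRef" is a hypothetical label, not something the plugin defines.
// One-off pass: label only the nodes that RTREE_REFERENCE points to.
MATCH ()-[:RTREE_REFERENCE]->(b) SET b:SpatialRef;

// Schema index on the id property of the labeled nodes.
CREATE INDEX ON :SpatialRef(id);

// A count of 0 means no reference node with this id exists yet.
MATCH (a)-[:RTREE_REFERENCE]->(b:SpatialRef {id: 989898})
RETURN count(a) AS reference_count;
```

Returning a count instead of the nodes gives a plain yes/no answer: the query yields a single row whether or not anything matched.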

Let me know if that works and please explain why you were querying b in your original question if you already knew it...

Aerodynamika
  • I can't really do it quite like that either. Maybe my question is not clear enough. This index is generated by the neo4j-spatial plugin, so I can't label the node that references the node that I'm indexing, at least as far as I know. I know that scouring through the whole lot is not the way to go, and that is why I'm asking whether a better way to do this exists. – pixeltom Feb 17 '14 at 16:00
  • The main point @deemeetree is making is that your node will have an incoming RTREE_REFERENCE only if it is already indexed. From your original question I'm assuming you are working with a node and need to know if it is indexed. Your code seems to be re-searching for your node first, before checking the incoming relationship. deemeetree's suggestion is to use an index for that. That can work, but since you know the ID of your node, you do not need an index. See my separate answer for a simpler suggestion. – Craig Taverner Feb 21 '14 at 12:53