12

Is there a way to iterate through every node in a neo4j database using py2neo?

My first thought was to iterate through GraphDatabaseService, but that didn't work. If there isn't a way to do it with py2neo, is there another Python interface that would let me?

Edit: I'm accepting @Nicholas's answer for now, but I'll update it if someone can give me a way that returns a generator.

beardc

3 Answers

13

I would suggest doing that with asynchronous Cypher, something like:

    from py2neo import neo4j, cypher

    graph_db = neo4j.GraphDatabaseService()

    def handle_row(row):
        node = row[0]
        # do something with `node` here

    cypher.execute(graph_db, "START z=node(*) RETURN z", row_handler=handle_row)

Of course you might want to exclude the reference node or otherwise tweak the query.

Nige

Nigel Small
  • 1
    Thanks, looks like this works. I'm assuming for a large graph it won't load all of them into python memory at once, correct? – beardc Jun 19 '12 at 01:38
  • Correct. The asynchronous Cypher execution submits each row for handling as it's received from the HTTP response stream. – Nigel Small Jun 19 '12 at 05:47
  • 2
    As of py2neo 1.6 (due for release October 2013) this will be possible with a streamed set of Cypher query results and standard Python iteration. – Nigel Small Sep 06 '13 at 08:53
  • It gives me the error `TypeError: is not JSON serializable`. What would be the equivalent for recent versions of py2neo? – montefuscolo Feb 01 '16 at 22:45
4

Two solutions come to mind. The first is a Cypher query:

    START n=node(*) return n

The other (I'm not familiar with Python, so I'll give the example in Java) is:

    GlobalGraphOperations.at(graphDatabaseService).getAllNodes()

which is what the old, deprecated graphDatabaseService.getAllNodes() recommends using instead.

BenMorel
Nicholas
  • Thanks. Executing the cypher query `START n=node(*) return n` returns a list, but I couldn't find an analog to your second answer. Now accepting answers that return generators. – beardc Jun 18 '12 at 14:08
  • I have considered several options for implementing a generator to iterate through all nodes in the database. Unfortunately, I don't think there is a way to achieve this without either (i) keeping the HTTP connection open until the application code has iterated through all items or (ii) loading all items into memory beforehand. The key issue with the generator approach is that traversal is necessarily controlled by the code _using_ the generator instead of that _providing_ it. This is why I feel the callback mechanism is preferable for this purpose. – Nigel Small Aug 17 '12 at 11:03
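To illustrate the trade-off described in the comment above, here is a minimal sketch of option (i): a callback-style call is run in a background thread and its rows are handed to the caller through a bounded queue, so the caller gets a generator. Nothing here is py2neo API; `rows_as_generator` and `fake_execute` are hypothetical names, and `fake_execute` only stands in for a streaming call that accepts a `row_handler` keyword.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def rows_as_generator(execute, maxsize=100):
    """Adapt a callback-style `execute(row_handler=...)` into a generator.

    `execute` is any callable that invokes `row_handler(row)` once per row.
    """
    rows = queue.Queue(maxsize=maxsize)  # bounded, so memory use stays flat

    def worker():
        try:
            execute(row_handler=rows.put)
        finally:
            rows.put(_DONE)  # always signal completion, even after an error

    threading.Thread(target=worker, daemon=True).start()

    while True:
        row = rows.get()
        if row is _DONE:
            return
        yield row


def fake_execute(row_handler):
    """Hypothetical stand-in for a streaming query: emits three one-column rows."""
    for i in range(3):
        row_handler([i])


for row in rows_as_generator(fake_execute):
    node = row[0]
    # do something with `node` here
```

With the py2neo 1.x call from the first answer, the adapter would presumably be fed something like `lambda row_handler: cypher.execute(graph_db, query, row_handler=row_handler)`. Note the cost: if the consumer stops iterating early, the worker blocks on the full queue and the underlying connection stays open. This sketch also silently drops any exception raised in the worker.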
4

For newer versions of py2neo, the accepted answer no longer works. Instead, use:

    from py2neo import Graph

    graph = Graph("http://user:pass@localhost:7474/db/data/")

    for n in graph.cypher.stream("START z=node(*) RETURN z"):
        # do something with node here
        print(n)
  • 1
    Looks like this one doesn't work in py2neo 4 as it gives error as below `AttributeError: 'Graph' object has no attribute 'cypher'` :( – vinit payal Jan 27 '20 at 06:18
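For py2neo releases where `graph.cypher` is gone, a possible equivalent is sketched below. It assumes that `Graph.run()` returns an iterable cursor of records indexable by position (which, as far as I know, is how py2neo v3 and later behave), and it uses `MATCH (n) RETURN n` because the `START n=node(*)` form is not accepted by recent Neo4j versions. The helper name `iter_nodes`, the URL, and the credentials are placeholders, not anything from this thread.

```python
def iter_nodes(graph, query="MATCH (n) RETURN n"):
    """Yield nodes one at a time from any object exposing `run(query)`.

    Assumes each record returned by `run` can be indexed by position,
    with the node in the first column.
    """
    for record in graph.run(query):
        yield record[0]


# Usage against a live server (URL and credentials are assumptions):
#
#     from py2neo import Graph
#     graph = Graph("bolt://localhost:7687", auth=("user", "pass"))
#     for node in iter_nodes(graph):
#         print(node)
```

Because `iter_nodes` is a generator, the caller controls the pace of iteration. Whether rows are actually streamed or buffered underneath depends on the py2neo version and protocol in use.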