I noticed a performance difference between querying a given node's incoming relationships and querying its outgoing relationships. In this case, outgoing was much faster.

The input file that generates the graph is sorted by the start node for each edge.

Does the order of the input file matter? Is there a difference in how the outgoing relationships are treated?

I read a bit of background on the internals, but it didn't answer my question about the difference in performance.

Jay
Can you clarify your question? What is the input file here, and how are you loading it? When you say "querying for incoming versus outgoing", how are you doing that querying? Slide 10 of this deck shows how relationships are stored: http://www.slideshare.net/aliraza995/neo4j-graph-storage-27104408 – FrobberOfBits Mar 01 '15 at 15:20

1 Answer

There should be no difference. There's another diagram of how things are stored in Neo4j on page 12 of my MSc Thesis.

What might be causing the difference is that you're running one test (the first one) with cold caches and the other with warm caches. If you flip your experiment and do outgoing first, then incoming, you may find that incoming is suddenly faster! That's because the data is read from disk the first time around and served from memory the second time.
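To rule out the cache effect, you could time both queries only after a warm-up pass and alternate which one runs first. Here's a minimal sketch in Python of such a harness. The `benchmark` function and its parameters are my own illustration, not part of Neo4j or any driver, and it assumes you wrap each of your queries in a zero-argument callable.

```python
import time
from statistics import median


def benchmark(queries, warmup=1, rounds=5):
    """Return the median wall-clock time per named query.

    `queries` maps a label to a zero-argument callable (hypothetical
    wrappers around your own incoming/outgoing queries).
    """
    names = list(queries)

    # Warm-up: run every query and discard the result, so that the
    # cold-cache (disk read) cost is not attributed to whichever runs first.
    for _ in range(warmup):
        for name in names:
            queries[name]()

    timings = {name: [] for name in names}
    for i in range(rounds):
        # Reverse the order on odd rounds so neither query always goes first.
        order = names if i % 2 == 0 else names[::-1]
        for name in order:
            start = time.perf_counter()
            queries[name]()
            timings[name].append(time.perf_counter() - start)

    # The median is less sensitive to a single slow outlier than the mean.
    return {name: median(samples) for name, samples in timings.items()}
```

You would call it with something like `benchmark({"outgoing": run_outgoing, "incoming": run_incoming})`, where both function names are placeholders for however you issue your queries. If the two medians end up close to each other, the original gap was most likely the cold cache.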

Michal Bachman