I have two graphs: one has 15k nodes and is a subgraph of the other, which has 30k nodes. To obtain the smaller one, I took the bigger one and deleted some nodes and their relationships. I then ran the same performance tests and queries on both graphs and was surprised to find that performance is better on the bigger graph. I do not know the reason. I read here that deleted nodes stay reserved so they can be reused when new nodes are inserted, but is that really the reason? I am using version 2.1.2.
-
Have you warmed up caches for both graphs before doing measurements? http://docs.neo4j.org/chunked/stable/configuration-caches.html is a good read. – Stefan Armbruster Jun 30 '14 at 13:47
-
Could you please also paste the example queries you are using for the comparison, or the whole comparison process? – ulkas Jun 30 '14 at 13:49
-
I did not warm up the caches, but that applies to both cases. The graphs are in two different databases: first I did the measurement with the small graph, then I shut down the database and loaded the other one. The queries look up 50 random values of one property (the same values in both graphs, which is possible because one is a subgraph of the other). – d.r.91 Jun 30 '14 at 14:13
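To make the warm-up suggestion from the comments concrete, here is a minimal sketch of a warm-up-then-measure loop. It is an illustration under assumptions, not the procedure used in the question: `run_query` is a hypothetical callable standing in for whatever executes one lookup against the database, and the round counts are arbitrary.

```python
import statistics
import time


def benchmark(run_query, values, warmup_rounds=2, measured_rounds=5):
    """Return the median wall-clock time (seconds) for one pass over all values.

    run_query is a hypothetical callable (an assumption of this sketch) that
    executes one lookup against the database and fully consumes the result.
    """
    # Warm-up passes: executed but not timed, so caches are populated first.
    for _ in range(warmup_rounds):
        for value in values:
            run_query(value)

    # Measured passes: time each full pass over the values.
    timings = []
    for _ in range(measured_rounds):
        start = time.perf_counter()
        for value in values:
            run_query(value)
        timings.append(time.perf_counter() - start)

    # The median is less sensitive to a single slow pass than the mean.
    return statistics.median(timings)


if __name__ == "__main__":
    # Stand-in for a real database call, only so the sketch runs on its own.
    def fake_query(name):
        return len(name)

    print(benchmark(fake_query, ["Julia Roberts"]))
```

Running the same function with the same values against each database, after a restart of each, would at least put both graphs under the same cache conditions.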
1 Answer
If you deleted and inserted in one go, then the records freed by the deletions have not been reused yet.
But at your small graph size this shouldn't matter. I think you have a different problem. Please share all the code and queries that run slowly, as well as more information about your data model and graph.

Michael Hunger
-
I use the full Cineasts dataset (12k movies, 50k actors) and reduced it by half because I wanted to compare performance across data sets of different sizes. The queries look like `MATCH (n)-[r]-() WHERE n.name = "Julia Roberts" RETURN n, count(*);` but with different actor names. – d.r.91 Jul 17 '14 at 09:35