
I have Neo4j running on an EC2 instance (large, Ubuntu), and I'm running some scripts against it that do lots of writes.

I noticed that after those scripts have run for a while (after they've written a couple of thousand nodes), the server starts to run very slowly, sometimes to the point where it gets absolutely stuck. Another weird part: resetting the instance from this state usually ends up with the server taking much longer than usual to initialize.

At first I suspected that Neo4j was using up all the RAM and that this was a paging problem, but I've read that Neo4j calculates the heap size and stack size limits dynamically. I also checked memory usage with top, and it looked like most of the RAM was unused, except for a Java process occasionally popping up, taking a few GB, and then disappearing quickly, which I assumed was Neo4j.

Anyway, here are my questions: do I need to configure the Neo4j server and/or wrapper, or should I let Neo4j calculate the memory settings dynamically on its own? And has anyone encountered something like what I've described, with any idea what could cause it?

Thanks!

Ronen Ness

1 Answer


It's been my experience that you definitely need to tweak the memory settings to fit your needs. The Neo4j manual has a whole section on it:

http://neo4j.com/docs/stable/configuration.html

I've not really heard of neo4j automatically adjusting to your server's memory capabilities, though just last night I did run across what seemed like a new configuration variable in conf/neo4j.properties:

# The amount of memory to use for mapping the store files, either in bytes or
# as a percentage of available memory. This will be clipped at the amount of
# free memory observed when the database starts, and automatically be rounded
# down to the nearest whole page. For example, if "500MB" is configured, but
# only 450MB of memory is free when the database starts, then the database will
# map at most 450MB. If "50%" is configured, and the system has a capacity of
# 4GB, then at most 2GB of memory will be mapped, unless the database observes
# that less than 2GB of memory is free when it starts.
#mapped_memory_total_size=50%
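
For illustration, here is a hedged sketch of what explicit settings might look like. The property names are the ones that come up in this thread (the wrapper config file and conf/neo4j.properties); the values are placeholders that would need tuning to the instance's actual RAM:

# In the wrapper config file: pin the JVM heap (values in MB; placeholders)
# instead of relying on the dynamically calculated default
wrapper.java.initmemory=2048
wrapper.java.maxmemory=2048

# In conf/neo4j.properties: cap the memory used for mapping the store files,
# leaving room for the heap and the OS
mapped_memory_total_size=50%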
Brian Underwood
  • Hi Brian, thank you for the answer. Just to be clear about what I meant: I opened the Neo4j wrapper config file and saw the following comment: # Java Heap Size: by default the Java heap size is dynamically # calculated based on available system resources. # Uncomment these lines to set specific initial and maximum # heap size in MB. #wrapper.java.initmemory=512 #wrapper.java.maxmemory=512 – Ronen Ness Dec 23 '14 at 13:46
  • So you think I should configure the neo4j-wrapper, the neo4j.properties, or both? Thanks :) – Ronen Ness Dec 23 '14 at 13:49
  • I've not touched the `neo4j-wrapper.properties` before, but I'd say you'd definitely want to configure some memory limits / memory mapped ID settings in the `neo4j.properties` – Brian Underwood Dec 24 '14 at 10:30
  • So I tried several mixtures of configurations, but nothing works. I think the bottleneck might be Amazon I/O and not Neo4j after all, but I need to do some additional research. Thanks :) – Ronen Ness Dec 24 '14 at 11:14
  • How are you writing? Are you using Cypher or Java? What else are you doing when writing, i.e. are you looking up stuff before the writes? Could you post some sample queries? Did you try batching the writes? – Michal Bachman Dec 28 '14 at 16:45
  • For this specific script I'm using the py2neo module, which (I think) uses the REST API. Basically, I iterate over all the nodes in the db, read them one by one, and update their properties. So to sum it up: I read and update nodes using the REST API, and not in batches. I expect it not to be very fast, but slowness to the point of getting stuck is still weird. PS: meanwhile the freezing problem has stopped, I have no idea why, but it still gets slower than usual sometimes. :/ – Ronen Ness Dec 30 '14 at 12:58
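
For what it's worth, one way to cut the per-node REST round trips described in the last comment is to batch the updates through Cypher transactions. Below is a minimal sketch, assuming py2neo 2.x and Neo4j 2.1+; the property names (id, score), the id(n) matching, and the batch size are placeholders, not anything from the original script:

from py2neo import Graph

graph = Graph("http://localhost:7474/db/data/")
BATCH_SIZE = 500  # placeholder; larger batches mean fewer round trips

def flush(batch):
    # One transaction per batch; UNWIND lets a single statement update
    # every node in the batch instead of one REST call per node.
    tx = graph.cypher.begin()
    tx.append("UNWIND {rows} AS row "
              "MATCH (n) WHERE id(n) = row.id "
              "SET n.score = row.score",
              {"rows": batch})
    tx.commit()

def update_in_batches(rows):
    # rows is an iterable of dicts like {"id": 123, "score": 0.7}
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) >= BATCH_SIZE:
            flush(batch)
            batch = []
    if batch:
        flush(batch)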