
So I have around 70 million spatial records that I want to add to the spatial layer. (I've tested with a small set and everything runs smoothly: queries return the same results as PostGIS and the layer operations seem fine.) But when I try to add all the spatial records to the database, the performance degrades rapidly, to the point where it gets really slow at around 5 million records (around 2h running time) and hangs at ~7.7 million (8 hours elapsed).

Since the spatial index is an R-tree that uses the graph structure to construct itself, I am wondering why it degrades as the number of records increases. R-tree insertions are O(log n) if I'm not mistaken, and that's why I'm concerned it might be something in the rearranging of bounding boxes on nodes that are not tree leaves that is causing the addToLayer process to get slower over time.
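
To make my concern concrete, here is a minimal, generic Guttman-style insertion sketch (just an illustration of the general algorithm, not the actual Neo4j Spatial RTreeIndex code): every insert descends by least bounding-box enlargement and widens every box on its path, and once a node exceeds its capacity it must be split, so the cost per insert depends on the tree's shape rather than being constant.

    import java.util.ArrayList;
    import java.util.List;

    class BBox {
        double minX, minY, maxX, maxY;

        BBox(double minX, double minY, double maxX, double maxY) {
            this.minX = minX; this.minY = minY;
            this.maxX = maxX; this.maxY = maxY;
        }

        double area() {
            return (maxX - minX) * (maxY - minY);
        }

        // How much this box's area would grow if it had to cover 'other'.
        double enlargement(BBox other) {
            double w = Math.max(maxX, other.maxX) - Math.min(minX, other.minX);
            double h = Math.max(maxY, other.maxY) - Math.min(minY, other.minY);
            return w * h - area();
        }

        void expandToInclude(BBox other) {
            minX = Math.min(minX, other.minX);
            minY = Math.min(minY, other.minY);
            maxX = Math.max(maxX, other.maxX);
            maxY = Math.max(maxY, other.maxY);
        }
    }

    class RTreeNode {
        final List<BBox> entries = new ArrayList<>();        // leaf payload
        final List<RTreeNode> children = new ArrayList<>();  // empty for leaves
        BBox box;

        // Descend by least enlargement, widening every bounding box on the path.
        void insert(BBox entry, int maxNodeReferences) {
            if (box == null) {
                box = new BBox(entry.minX, entry.minY, entry.maxX, entry.maxY);
            } else {
                box.expandToInclude(entry);
            }
            if (children.isEmpty()) {
                entries.add(entry);
                if (entries.size() > maxNodeReferences) {
                    // A real R-tree splits the node here (quadratic split in
                    // many implementations) -- the step I suspect is costly.
                }
                return;
            }
            RTreeNode best = children.get(0);
            for (RTreeNode child : children) {
                if (child.box.enlargement(entry) < best.box.enlargement(entry)) {
                    best = child;
                }
            }
            best.insert(entry, maxNodeReferences);
        }
    }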

Currently I'm adding nodes to the layer like this (lots of hardcoded stuff, since I'm trying to figure out the problem before worrying about patterns and code style):

Transaction tx = database.beginTx();
try {
    ResourceIterable<Node> layerNodes = GlobalGraphOperations.at(database).getAllNodesWithLabel(label);
    long i = 0L;
    for (Node node : layerNodes) {
        // one transaction per node, nested inside the outer transaction
        Transaction tx2 = database.beginTx();
        try {
            layer.add(node);
            i++;
            if (i % commitInterval == 0) {
                log("indexing (" + i + " nodes added) ... time in seconds: "
                        + (1.0 * (System.currentTimeMillis() - startTime) / 1000));
            }
            tx2.success();
        } finally {
            tx2.close();
        }
    }
    tx.success();
} finally {
    tx.close();
}

Any thoughts? Any ideas of how performance could be increased?

ps.: using the Java API, Neo4j 2.1.2, Spatial 0.13; Core i5 3570k @ 4.5GHz, 32GB RAM, dedicated 2TB 7200rpm hard drive for the database (no OS, no virtual memory files, only the data itself)

ps2.: All geometries are LineStrings (if that's important :P); they represent streets, roads, etc.

ps3.: The nodes are already in the database; I only need to add them to the layer so that I can perform spatial queries. The bbox and wkb attributes are OK, tested and working for a small set.

Thank you in advance

After altering and running the code again (which takes 5 hours just to insert the points into the database, no layer involved), this happened. I will try to increase the JVM heap and the embedded graph memory parameters.

indexing (4020000 nodes added) ... time in seconds: 8557.361
Exception in thread "main" org.neo4j.graphdb.TransactionFailureException: Unable to commit transaction
    at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:140)
    at gis.CataImporter.addDataToLayer(CataImporter.java:263)
    at Neo4JLoadData.addDataToLayer(Neo4JLoadData.java:138)
    at Neo4JLoadData.main(Neo4JLoadData.java:86)
Caused by: javax.transaction.SystemException: Kernel has encountered some problem, please perform neccesary action (tx recovery/restart)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at org.neo4j.kernel.impl.transaction.KernelHealth.assertHealthy(KernelHealth.java:61)
    at org.neo4j.kernel.impl.transaction.TxManager.assertTmOk(TxManager.java:339)
    at org.neo4j.kernel.impl.transaction.TxManager.getTransaction(TxManager.java:725)
    at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:119)
    ... 3 more
Caused by: javax.transaction.xa.XAException
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:560)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:448)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:385)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:123)
    at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:124)
    at gis.CataImporter.addDataToLayer(CataImporter.java:256)
    ... 2 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.neo4j.kernel.impl.nioneo.store.DynamicRecord.clone(DynamicRecord.java:179)
    at org.neo4j.kernel.impl.nioneo.store.PropertyBlock.clone(PropertyBlock.java:215)
    at org.neo4j.kernel.impl.nioneo.store.PropertyRecord.clone(PropertyRecord.java:221)
    at org.neo4j.kernel.impl.nioneo.xa.Loaders$2.clone(Loaders.java:118)
    at org.neo4j.kernel.impl.nioneo.xa.Loaders$2.clone(Loaders.java:81)
    at org.neo4j.kernel.impl.nioneo.xa.RecordChanges$RecordChange.ensureHasBeforeRecordImage(RecordChanges.java:217)
    at org.neo4j.kernel.impl.nioneo.xa.RecordChanges$RecordChange.prepareForChange(RecordChanges.java:162)
    at org.neo4j.kernel.impl.nioneo.xa.RecordChanges$RecordChange.forChangingData(RecordChanges.java:157)
    at org.neo4j.kernel.impl.nioneo.xa.PropertyCreator.primitiveChangeProperty(PropertyCreator.java:64)
    at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransactionContext.primitiveChangeProperty(NeoStoreTransactionContext.java:125)
    at org.neo4j.kernel.impl.nioneo.xa.NeoStoreTransaction.nodeChangeProperty(NeoStoreTransaction.java:1244)
    at org.neo4j.kernel.impl.persistence.PersistenceManager.nodeChangeProperty(PersistenceManager.java:119)
    at org.neo4j.kernel.impl.api.KernelTransactionImplementation$1.visitNodePropertyChanges(KernelTransactionImplementation.java:344)
    at org.neo4j.kernel.impl.api.state.TxStateImpl$6.visitPropertyChanges(TxStateImpl.java:238)
    at org.neo4j.kernel.impl.api.state.PropertyContainerState.accept(PropertyContainerState.java:187)
    at org.neo4j.kernel.impl.api.state.NodeState.accept(NodeState.java:148)
    at org.neo4j.kernel.impl.api.state.TxStateImpl.accept(TxStateImpl.java:160)
    at org.neo4j.kernel.impl.api.KernelTransactionImplementation.createTransactionCommands(KernelTransactionImplementation.java:332)
    at org.neo4j.kernel.impl.api.KernelTransactionImplementation.prepare(KernelTransactionImplementation.java:123)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.prepareKernelTx(XaResourceManager.java:900)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:510)
    at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:548)
    ... 7 more

28/07 -> Increasing memory did not help. Now I'm testing some modifications to the RTreeIndex and LayerRTreeIndex (what exactly does the field maxNodeReferences do?):

// Constructor

public LayerRTreeIndex(GraphDatabaseService database, Layer layer) {
    this(database, layer, 100);     
}

public LayerRTreeIndex(GraphDatabaseService database, Layer layer, int maxNodeReferences) {
    super(database, layer.getLayerNode(), layer.getGeometryEncoder(), maxNodeReferences);
    this.layer = layer;
}

It is hardcoded to 100, and changing its value changes when (in terms of the number of nodes added) my addToLayer method crashes with an OutOfMemoryError. If I'm not mistaken, changing that field's value increases or decreases the tree's width and depth (100 giving a wider tree than 50, and 50 giving a deeper tree than 100).
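
Assuming maxNodeReferences is the maximum number of children an index node may hold before it is split (which is what the constructor above suggests), the width/depth trade-off can be sanity-checked with simple arithmetic. A back-of-the-envelope snippet of my own, using the 70 million figure from the question:

    public class RTreeDepthEstimate {
        public static void main(String[] args) {
            long n = 70_000_000L; // geometry count from the question
            for (int maxNodeReferences : new int[] { 50, 100 }) {
                // A packed tree with branching factor b over n entries
                // needs roughly ceil(log_b(n)) levels.
                long levels = (long) Math.ceil(Math.log(n) / Math.log(maxNodeReferences));
                System.out.println("maxNodeReferences=" + maxNodeReferences
                        + " -> ~" + levels + " levels");
            }
            // prints ~5 levels for 50 and ~4 levels for 100
        }
    }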

To summarize the progress so far:

  • Incorrect usage of transactions corrected by @Jim
  • Memory heap increased to 27GB following @Peter's advice
  • 3 spatial layers to go, but now the problem gets real, because they're the big ones.
  • Did some memory profiling while adding nodes to the spatial layer, and I found some interesting points.

Memory and GC profiling: http://postimg.org/gallery/biffn9zq/

The type that uses the most memory throughout the entire process is byte[], which I can only assume belongs to the geometries' wkb properties (either the geometry itself or the R-tree's bbox). With that in mind, I also noticed (you can check in the new profiling images) that the amount of heap space used never goes below the 18GB mark.
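
For reference, here is roughly what those byte[] instances should be: each geometry node carries its geometry serialized as WKB in a node property (the property name "wkb" is the one from my setup above), and decoding one with JTS, which Neo4j Spatial builds on, would look roughly like this:

    import org.neo4j.graphdb.Node;

    import com.vividsolutions.jts.geom.Geometry;
    import com.vividsolutions.jts.io.ParseException;
    import com.vividsolutions.jts.io.WKBReader;

    class WkbDecode {
        // Reads the WKB property off a geometry node and decodes it with JTS.
        // Must be called inside an open Transaction.
        static Geometry decode(Node node) throws ParseException {
            byte[] wkb = (byte[]) node.getProperty("wkb"); // property name from my setup
            return new WKBReader().read(wkb);              // here, always a LineString
        }
    }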

According to this question ("are java primitives garbage collected"), primitive types in Java are raw data and therefore not subject to garbage collection; they are only freed from the method's stack when the method returns (so maybe when I create a new spatial layer, all those wkb byte arrays will remain in memory until I manually close the layer object).

Does that make any sense? Isn't there a better way to manage memory resources so that the layer doesn't keep unused, old data loaded?

  • What layer class are you using? – Jim Biard Jul 26 '14 at 18:25
  • Here's some profiling info, don't know if it helps: [CPU time methods](http://s28.postimg.org/hn6dnpdgt/neo4j_cputime_methods.png) [Self time methods](http://s28.postimg.org/q6prlglt9/neo4j_selftime_methods.png) [CPU time classes](http://s28.postimg.org/hxdw71s31/neo4j_cputime_classes.png) – catacavaco Jul 26 '14 at 18:48
  • Google indicates there are some issues in the Neo4j rtree implementation. Looks like there might be a bug. – Has QUIT--Anony-Mousse Jul 27 '14 at 09:25
  • Looks like I'm facing some weird behavior when adding nodes to the spatial layer; it usually crashes with that exception when getting close to 4 million nodes. – catacavaco Jul 28 '14 at 20:18

3 Answers


Catacavaco,

You are doing each add as a separate transaction. To make use of your commitInterval, you need to change your code to something like this.

Transaction tx = database.beginTx();

try {
    ResourceIterable<Node> layerNodes = GlobalGraphOperations.at(database).getAllNodesWithLabel(label);

    long i = 0L;

    for (Node node : layerNodes) {
        layer.add(node);
        i++;

        if (i % commitInterval == 0) {
            // commit the current batch of layer additions
            tx.success();
            tx.close();

            log("indexing (" + i + " nodes added) ... time in seconds: "
                + (1.0 * (System.currentTimeMillis() - startTime) / 1000));

            // start a new transaction for the next batch
            tx = database.beginTx();
        }
    }

    tx.success();
} finally {
    tx.close();
}

See if this helps.

Grace and peace,

Jim

Jim Biard
  • I did change that to test whether my bottleneck was IO, disk usage or something related, but it was strictly CPU usage; going to change it back to test, but without much hope. – catacavaco Jul 26 '14 at 18:52
  • And the commit interval should be something between 10k and 50k – Michael Hunger Jul 26 '14 at 21:12
  • I've applied Jim's suggestions, commit interval set to 10k; still slowing down, so maybe the issue is not related to the transactions. From 0 to around 4-5 million, the insert rate (by which I mean adding an existing geometry to the spatial layer) runs at 0.6 seconds a batch (roughly 16k nodes per second), while after the 5 million mark until 7 million (where it completely hangs) it gradually drops to 10-20 geometries per second. – catacavaco Jul 26 '14 at 21:27
  • Maybe if I query the nodes sorted by bounding box area (which is easy to calculate even without the spatial layer), I could get the bigger ones first, so that later insertions into the RTree would not cause as many quadratic splits and bbox adjustments. – catacavaco Jul 26 '14 at 21:35
  • I'm looking forward to hearing what you find next! – Jim Biard Jul 27 '14 at 01:56
  • Thanks @Jim, the batch insertion now follows the commitInterval without too much transactional overhead; that part of the problem is solved, but still too much memory is being used while adding nodes to the spatial layer. – catacavaco Jul 31 '14 at 21:15
  • Would like to confirm this idiom. I find it suspicious that the call to GlobalGraphOperations.at().getAllNodesWithLabel(), which requires a transaction, is being started in one transaction and continued over many others. Also, a node that is read from the iterator in one transaction is mutated in another. Is this all OK? Thanks. – Paul Jackson May 19 '15 at 13:01
  • The node read from the iterator is not mutated. A relationship from a parent RTree node to the node being added is created, but that is all that is done. I haven't gone into depth with the Iterable and the transactions, but while it does appear from the documentation to be an incorrect usage, I am guessing that all that is closed when each sub-transaction is ended is a hold on the nodes referenced by the Iterable. The information about node id, etc, must still be valid as long as there aren't other workers in the database, which I believe was the case in this instance. – Jim Biard May 20 '15 at 19:18
  • (continued from above) This sort of approach was recommended by Michael Hunger of Neo4j in another question (http://stackoverflow.com/questions/19745725/neo4j-handaling-a-long-transation-on-one-iterator). There are no nested transactions, so no help there. – Jim Biard May 20 '15 at 19:26

Looking at the error java.lang.OutOfMemoryError: GC overhead limit exceeded, there might be some excessive object creation going on. From your profiling results it doesn't look like it; could you double-check?
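
For reference, "GC overhead limit exceeded" means the JVM was spending nearly all of its time in garbage collection while reclaiming almost nothing, and the usual first response is to give the process more heap. Illustrative launch flags (the jar name is a placeholder; the 8GB/27GB values are the ones the question eventually settled on, per the comments below):

    java -Xms8g -Xmx27g -jar spatial-importer.jar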

Peter Neubauer
  • Re-checked the behavior when importing the last few layers (the bigger ones), and it seemed that the operations exploding the index's nodes into more than one bbox, combined with the size of the heap (10GB), made the GC work nonstop cleaning up the mess after every batch. The same index in PostGIS (which uses a different kind of R-tree) fills up to 7GB (just the index). With that in mind, the solution was to increase the heap size even further; now running at minimum 8GB and max 27GB, GC activity has dropped to nearly 0% (still running, because the data is large; I'm currently at 30GB out of ~200GB). – catacavaco Jul 30 '14 at 04:31
  • Maybe the performance degradation came when the heap was completely full and the GC had to work overtime to make room for the new data being read from the shapefiles. – catacavaco Jul 30 '14 at 04:32
  • This would be a really good test use-case for high performance and volume imports. – Michael Hunger Jul 30 '14 at 09:38
  • Only 3 spatial layers left (sadly they're the massive ones: 32, 64 and 92GB of shapefiles respectively); updated the question with more information. – catacavaco Jul 31 '14 at 21:19

Finally solved the problem with three fixes: setting cache_type=none, increasing the heap size for the neostore low-level graph engine, and setting use_memory_mapped_buffers=true so that memory management is done by the OS and not the slow-ish JVM.

That way, my custom batch insertion into the spatial layers went much faster, and without any errors/exceptions.
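
For anyone hitting the same wall, here is a sketch of how those settings might be applied to an embedded Neo4j 2.1 database (cache_type and use_memory_mapped_buffers are the 2.1 property names; the store path and mapped_memory sizes are illustrative, not the exact ones I used, and the heap itself is raised with the usual -Xms/-Xmx JVM flags):

    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.factory.GraphDatabaseFactory;

    GraphDatabaseService database = new GraphDatabaseFactory()
            .newEmbeddedDatabaseBuilder("/path/to/graph.db")
            .setConfig("cache_type", "none")                // no JVM object cache
            .setConfig("use_memory_mapped_buffers", "true") // OS-managed store buffers
            .setConfig("neostore.nodestore.db.mapped_memory", "2G")      // illustrative sizes
            .setConfig("neostore.propertystore.db.mapped_memory", "8G")
            .newGraphDatabase();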

Thanks for all the help provided; I guess my answer is just a combination of all the tips people provided here. Thanks very much!

catacavaco