Java OutOfMemoryError for OpenCMIS method getDescendants(-1)

Question

My goal is to get all the documents from an Alfresco site that contains 100,000 documents. I am using the OpenCMIS libraries, and the call below fails with java.lang.OutOfMemoryError: Java heap space.

The total size of all documents on the site is 500 GB.

This is the code:

CmisObject cmisObject = session.getObjectByPath(path);
FolderImpl sitoFolder = (FolderImpl) cmisObject;
List<Tree<FileableCmisObject>> sitoFolderDescendants = sitoFolder.getDescendants(-1);

This is the stack trace:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.HashMap.newNode(HashMap.java:1742)
at java.util.HashMap.putVal(HashMap.java:630)
at java.util.HashMap.put(HashMap.java:611)
at org.apache.chemistry.opencmis.commons.impl.XMLWalker.handleExtensionLevel(XMLWalker.java:128)
at org.apache.chemistry.opencmis.commons.impl.XMLWalker.handleExtensionLevel(XMLWalker.java:161)
at org.apache.chemistry.opencmis.commons.impl.XMLWalker.handleExtensionLevel(XMLWalker.java:161)
at org.apache.chemistry.opencmis.commons.impl.XMLWalker.handleExtension(XMLWalker.java:112)
at org.apache.chemistry.opencmis.commons.impl.XMLWalker.walk(XMLWalker.java:58)
at org.apache.chemistry.opencmis.commons.impl.XMLConverter$18.read(XMLConverter.java:2198)
at org.apache.chemistry.opencmis.commons.impl.XMLConverter$18.read(XMLConverter.java:2188)
at org.apache.chemistry.opencmis.commons.impl.XMLWalker.walk(XMLWalker.java:56)
at org.apache.chemistry.opencmis.commons.impl.XMLConverter.convertObject(XMLConverter.java:1102)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseElement(AtomPubParser.java:332)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseEntry(AtomPubParser.java:284)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseFeed(AtomPubParser.java:243)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseChildren(AtomPubParser.java:372)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseElement(AtomPubParser.java:339)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseEntry(AtomPubParser.java:284)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseFeed(AtomPubParser.java:243)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseChildren(AtomPubParser.java:372)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseElement(AtomPubParser.java:339)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseEntry(AtomPubParser.java:284)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseFeed(AtomPubParser.java:243)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseChildren(AtomPubParser.java:372)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseElement(AtomPubParser.java:339)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseEntry(AtomPubParser.java:284)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseFeed(AtomPubParser.java:243)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseChildren(AtomPubParser.java:372)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseElement(AtomPubParser.java:339)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseEntry(AtomPubParser.java:284)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parseFeed(AtomPubParser.java:243)
at org.apache.chemistry.opencmis.client.bindings.spi.atompub.AtomPubParser.parse(AtomPubParser.java:109)

Try increasing the size of the available memory pool with [the Xmx option](http://docs.oracle.com/javase/8/docs/technotes/tools/windows/java.html#BABHDABI), or failing that, buy a bigger computer. — President James K. Polk, Jul 01 '17 at 14:34
Increasing Xmx is not a solution. I need to understand how much memory the getDescendants method needs. — G.Dileo, Jul 01 '17 at 14:35
Would 6 GB be enough for a site of 100,000 documents with a total size of 500 GB? — G.Dileo, Jul 01 '17 at 14:38

score 3 · Answer 1 · answered Jul 01 '17 at 16:42

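Don't use getDescendants(-1)! A depth of -1 asks the repository for the whole subtree in one response, and the client has to parse and hold all of it in memory at once. If you really, really need getDescendants(), use an operation context that selects only the properties you need and turns off Allowable Actions and ACLs. See http://chemistry.apache.org/docs/cmis-samples/samples/operation-context/index.html for examples.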
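
The operation context suggested above could be set up roughly like this. It is a sketch, not a tested fix for the site in the question: the class name, the helper names, and the property filter are assumptions, and it needs the OpenCMIS client library and an already connected Session. Even with a lean context, a large depth still returns the whole subtree in one response, so a limited depth is probably safer.

```java
import java.util.List;

import org.apache.chemistry.opencmis.client.api.FileableCmisObject;
import org.apache.chemistry.opencmis.client.api.Folder;
import org.apache.chemistry.opencmis.client.api.OperationContext;
import org.apache.chemistry.opencmis.client.api.Session;
import org.apache.chemistry.opencmis.client.api.Tree;
import org.apache.chemistry.opencmis.commons.enums.IncludeRelationships;

// Hypothetical helper class; the name is not part of OpenCMIS.
final class LeanDescendants {

    // Builds a context that requests as little data per object as possible.
    static OperationContext leanContext(Session session) {
        OperationContext ctx = session.createOperationContext();
        // Assumption: only these properties are needed. Adjust to your use case.
        ctx.setFilterString("cmis:objectId,cmis:name,cmis:baseTypeId,cmis:objectTypeId");
        ctx.setIncludeAllowableActions(false);
        ctx.setIncludeAcls(false);
        ctx.setIncludePolicies(false);
        ctx.setIncludeRelationships(IncludeRelationships.NONE);
        ctx.setRenditionFilterString("cmis:none");
        ctx.setIncludePathSegments(false);
        // Skip the client-side object cache so fetched objects are not retained there.
        ctx.setCacheEnabled(false);
        return ctx;
    }

    // Fetches descendants of the folder at 'path' down to 'depth' levels using the lean context.
    static List<Tree<FileableCmisObject>> descendants(Session session, String path, int depth) {
        OperationContext ctx = leanContext(session);
        Folder folder = (Folder) session.getObjectByPath(path, ctx);
        return folder.getDescendants(depth, ctx);
    }
}
```

Casting to the Folder interface instead of FolderImpl keeps the code independent of the client implementation class.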

score 2 · Answer 2 · answered Jul 01 '17 at 14:49

I don't think it is a good idea to fetch all the nodes at once.

CMIS has several ways to paginate a query. With pagination you can retrieve a predefined number of documents at a time and then free the memory.

See, for example, the question "Apache CMIS: Paging query result".
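
To illustrate the idea of fetching a fixed number of items at a time, here is a minimal sketch of a folder walk that holds only one page of children in memory. Node, ChildPages, and the in-memory repository are hypothetical stand-ins so that the example runs without a server; they are not OpenCMIS types. With OpenCMIS, the fetch step would probably map to folder.getChildren(context).skipTo(skip).getPage(max), since getChildren returns an ItemIterable.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class PagedWalk {

    // Hypothetical minimal view of a repository node (not an OpenCMIS type).
    static final class Node {
        final String id;
        final boolean folder;

        Node(String id, boolean folder) {
            this.id = id;
            this.folder = folder;
        }
    }

    // Hypothetical paged lookup: up to 'max' children of a folder, starting at offset 'skip'.
    interface ChildPages {
        List<Node> fetch(String folderId, long skip, int max);
    }

    // Visits every document below rootId, one page at a time; returns the number of documents seen.
    static long walk(ChildPages repo, String rootId, int pageSize, Consumer<Node> onDocument) {
        Deque<String> pendingFolders = new ArrayDeque<>();
        pendingFolders.push(rootId);
        long documents = 0;
        while (!pendingFolders.isEmpty()) {
            String folderId = pendingFolders.pop();
            long skip = 0;
            List<Node> page;
            do {
                page = repo.fetch(folderId, skip, pageSize);
                for (Node child : page) {
                    if (child.folder) {
                        pendingFolders.push(child.id); // remember only the id, not the subtree
                    } else {
                        onDocument.accept(child);
                        documents++;
                    }
                }
                skip += page.size();
            } while (page.size() == pageSize); // a short page means the folder is exhausted
        }
        return documents;
    }

    // In-memory stand-in for a repository, used only to exercise the walk.
    static ChildPages inMemory(Map<String, List<Node>> tree) {
        return (folderId, skip, max) -> {
            List<Node> all = tree.getOrDefault(folderId, Collections.<Node>emptyList());
            int from = (int) Math.min(skip, all.size());
            int to = Math.min(from + max, all.size());
            return all.subList(from, to);
        };
    }

    public static void main(String[] args) {
        Map<String, List<Node>> tree = new HashMap<>();
        tree.put("site", Arrays.asList(new Node("a", true), new Node("d1", false), new Node("d2", false)));
        tree.put("a", Arrays.asList(new Node("d3", false), new Node("d4", false), new Node("d5", false)));
        List<String> seen = new ArrayList<>();
        long count = walk(inMemory(tree), "site", 2, n -> seen.add(n.id));
        System.out.println(count + " documents: " + seen);
    }
}
```

The walk keeps a stack of folder ids rather than a tree of objects, so memory use depends on the page size and the number of pending folders, not on the total number of documents.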
