This is how I'm reading my large MongoDB collection (every document has very large chunks of data in its attributes):

DBCursor cursor = collection.find(/* my query */);
while (cursor.hasNext()) {
  DBObject object = cursor.next();
  doSomething(object); // no data stays in memory
}
cursor.close();

I'm getting:

java.lang.OutOfMemoryError: Java heap space
at java.lang.StringCoding$StringDecoder.decode(Unknown Source)
at java.lang.StringCoding.decode(Unknown Source)
at java.lang.String.<init>(Unknown Source)
at org.bson.BasicBSONDecoder$BSONInput.readUTF8String(BasicBSONDecoder.java:463)
at org.bson.BasicBSONDecoder.decodeElement(BasicBSONDecoder.java:155)
at org.bson.BasicBSONDecoder._decode(BasicBSONDecoder.java:79)
at org.bson.BasicBSONDecoder.decode(BasicBSONDecoder.java:57)
at com.mongodb.DefaultDBDecoder.decode(DefaultDBDecoder.java:56)
at com.mongodb.Response.<init>(Response.java:83)
at com.mongodb.DBPort.go(DBPort.java:124)
at com.mongodb.DBPort.call(DBPort.java:74)
at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:286)
at com.mongodb.DBTCPConnector.call(DBTCPConnector.java:257)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:310)
at com.mongodb.DBApiLayer$MyCollection.__find(DBApiLayer.java:295)
at com.mongodb.DBCursor._check(DBCursor.java:368)
at com.mongodb.DBCursor._hasNext(DBCursor.java:459)
at com.mongodb.DBCursor.hasNext(DBCursor.java:484)

The exception is thrown after 200-300 objects have been processed. Does the driver hold data in memory? I'm using:

<dependency>
  <groupId>org.mongodb</groupId>
  <artifactId>mongo-java-driver</artifactId>
  <version>2.10.1</version>
</dependency>
yegor256
I would have thought the cursor does not keep references to previous objects (the code is open source, so you could have a look at the implementation). Possibly related: http://stackoverflow.com/questions/13531004/java-outofmemoryerror-strange-behaviour – assylias Sep 08 '13 at 17:14

1 Answer

The number of objects the driver will hold in memory depends on the cursor's batch size. I'm not sure what the default batch size is when you don't set one explicitly. Depending on the maximum size of your documents and the amount of heap space available, you should set the batch size accordingly. Keep in mind that the current maximum document size supported by MongoDB is 16MB.

See DBCursor.batchSize() in the MongoDB Java driver API docs for details.
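As a rough illustration of sizing the batch from the heap you can spare, here is a minimal sketch. The class `BatchSizing`, the method `safeBatchSize`, and the idea of a per-batch "heap budget" are my own hypothetical names, not part of the driver. It also assumes the worst case, where every document is at the 16MB maximum. Decoded Java objects can take more heap than their BSON bytes, so the budget should be chosen conservatively.

```java
public final class BatchSizing {

    /** Maximum BSON document size mentioned above: 16MB. */
    public static final long MAX_DOCUMENT_BYTES = 16L * 1024 * 1024;

    private BatchSizing() {
        // static helper only
    }

    /**
     * Estimates how many worst-case documents fit into the given heap budget.
     *
     * @param heapBudgetBytes heap you are willing to spend on one batch (hypothetical knob)
     * @param maxDocBytes     largest document size you expect, in bytes
     * @return a batch size of at least 2
     */
    public static int safeBatchSize(long heapBudgetBytes, long maxDocBytes) {
        if (heapBudgetBytes <= 0 || maxDocBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        long docs = heapBudgetBytes / maxDocBytes;
        // Assumption: a batch size of 1 is reported to behave like a limit in the
        // legacy wire protocol (the cursor is closed after one document), so this
        // sketch never returns less than 2.
        long clamped = Math.max(2L, Math.min(docs, (long) Integer.MAX_VALUE));
        return (int) clamped;
    }
}
```

With the 2.x driver from the question, the result could then be passed to the cursor, for example `collection.find(query).batchSize(BatchSizing.safeBatchSize(budget, BatchSizing.MAX_DOCUMENT_BYTES))`, where `budget` is whatever share of the heap you decide to give one batch.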

mbenoit99
Default batch size is 101 for the first fetch; subsequent batches are 4MB: http://docs.mongodb.org/manual/core/cursors/ – Trisha Oct 21 '13 at 13:07