
While inserting documents, the code fails with the Java heap space error shown below when running in an OpenShift container, although it works fine in my local environment.

The code throws `java.lang.OutOfMemoryError: Java heap space` at `size = collectionDB.aggregate(pipeline).into(results).size()`. After removing that line it works fine, so the issue seems to be with this way of getting the size. However, I still need the number of documents produced by the pipeline. Can anyone help me resolve the heap memory issue and get the size of the aggregation result without running out of heap?

List<Document> results = new ArrayList<>();
int size = collectionDB.aggregate(pipeline).into(results).size();

Full code:

List<? extends Bson> pipeline = Arrays.asList(
        new Document()
                .append("$match", new Document()
                        .append("key", "value")
                        .append("AmountType", "value2")
                        .append("Period", Period2)
                ),
        new Document()
                .append("$addFields", new Document()
                        .append("ID", "$_id")
                        .append("Period", value)
                )
);
AggregateIterable<Document> response = collectionDB.aggregate(pipeline).allowDiskUse(false);
LOG.info("log: "+response.toString().length());
List<Document> results = new ArrayList<>();
int size = collectionDB.aggregate(pipeline).into(results).size();
LOG.info("size: "+size);
for (Document dbObject : response)
{
    collectionDB.insertOne(dbObject);
}

Exception:

Exception in thread "http-nio-8105-Acceptor" Exception in thread "http-nio-8105-ClientPoller" java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space
    at sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:263)
    at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept(NioEndpoint.java:463)
    at org.apache.tomcat.util.net.NioEndpoint.serverSocketAccept(NioEndpoint.java:73)
    at org.apache.tomcat.util.net.Acceptor.run(Acceptor.java:95)
    at java.lang.Thread.run(Thread.java:748)
2022-02-24 09:48:53.550 ERROR 1 --- [nio-8105-exec-7] o.a.c.c.C.[.[.[/].[dispatcherServlet]    : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed; nested exception is java.lang.OutOfMemoryError: Java heap space] with root cause

java.lang.OutOfMemoryError: Java heap space
    at java.lang.StringCoding.decode(StringCoding.java:215) ~[na:1.8.0_262]
    at java.lang.String.<init>(String.java:463) ~[na:1.8.0_262]
    at java.lang.String.<init>(String.java:515) ~[na:1.8.0_262]
    at org.bson.io.ByteBufferBsonInput.readString(ByteBufferBsonInput.java:160) ~[bson-3.8.2.jar!/:na]
    at org.bson.io.ByteBufferBsonInput.readCString(ByteBufferBsonInput.java:139) ~[bson-3.8.2.jar!/:na]
    at org.bson.BsonBinaryReader.readBsonType(BsonBinaryReader.java:123) ~[bson-3.8.2.jar!/:na]
    at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:149) ~[bson-3.8.2.jar!/:na]
    at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:45) ~[bson-3.8.2.jar!/:na]
    at org.bson.codecs.configuration.LazyCodec.decode(LazyCodec.java:47) ~[bson-3.8.2.jar!/:na]
    at org.bson.codecs.DocumentCodec.readValue(DocumentCodec.java:222) ~[bson-3.8.2.jar!/:na]
    at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:151) ~[bson-3.8.2.jar!/:na]
    at org.bson.codecs.DocumentCodec.decode(DocumentCodec.java:45) ~[bson-3.8.2.jar!/:na]
    at com.mongodb.operation.CommandResultArrayCodec.decode(CommandResultArrayCodec.java:52) ~[mongodb-driver-core-3.8.2.jar!/:na]
  • I would rather suspect your problem is `into(results)`. You're loading all documents into memory. It is not clear to me why you need to know the size. You can also just count while inserting and log afterwards. Also, you're currently executing the query twice, which seems rather inefficient, and I'm pretty sure you can do everything within MongoDB without querying for data and then inserting it back. – Mark Rotteveel Feb 25 '22 at 19:43
  • @Mark Rotteveel, thank you for your reply. My query is modifying the documents and I need to know the count of documents modified. Can you please let me know how I can count the documents before inserting? Thanks – Rose Feb 28 '22 at 06:25
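To illustrate the "count while inserting" suggestion in the comment above, here is a minimal sketch. A plain `List` and `Consumer` stand in for the aggregation cursor and `collectionDB::insertOne` (those stand-ins are assumptions for illustration); the point is that the result set is streamed one document at a time instead of being buffered by `into(results)`:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class CountWhileInserting {
    // Streams items to a sink one at a time and returns how many were written,
    // avoiding buffering the whole result set in memory (unlike into(results)).
    static <T> long insertAndCount(Iterable<T> cursor, Consumer<T> insertOne) {
        long count = 0;
        for (T item : cursor) {
            insertOne.accept(item);
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Stand-ins for the MongoDB cursor and target collection.
        List<String> cursor = Arrays.asList("a", "b", "c");
        List<String> sink = new ArrayList<>();
        long n = insertAndCount(cursor, sink::add);
        System.out.println(n); // 3
    }
}
```

With the real driver, `cursor` would be `collectionDB.aggregate(pipeline)` and `insertOne` would be `collectionDB::insertOne`, so the count is obtained as a by-product of the insert loop rather than by a second query.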

2 Answers


Isn't the pipeline size the same as the results size?

 List<Document> results = new ArrayList<>();
 collectionDB.aggregate(pipeline).into(results);
 LOG.info("size: "+results.size());

The other idea is to increase the heap space: Java heap space with MongoDB
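If increasing the heap is the route taken, note that inside a container the JVM limit usually has to be raised explicitly, and must stay below the container's memory limit. A sketch, assuming a fat-jar startup; the exact variable an image honors varies, though `JAVA_TOOL_OPTIONS` is read by the JVM itself:

```shell
# Illustrative values only -- keep -Xmx below the container's memory limit.
# JAVA_TOOL_OPTIONS is picked up by the JVM directly; many base images also honor JAVA_OPTS.
export JAVA_TOOL_OPTIONS="-Xmx1g"
java -jar app.jar
```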

  • While the additional `size` variable may be unnecessary that kind of misses the point, which is that OP is getting an `OutOfMemoryError`. – TastyWheat Feb 25 '22 at 20:21
  • Maybe he really has a lot of data. Also, `aggregate` is a lazy operation. I don't know if it's connected, but in his listing he has `.allowDiskUse(false)` set earlier. Maybe it's related to this setting. – Gregu Feb 27 '22 at 20:28

It's entirely possible your local machine actually has more memory available than the container. I'm guessing your local machine has 8-32 GB of memory available and containers with that much memory allotted are very uncommon.

From a quick skim of the MongoDB Javadocs I'd attempt something like this:

long size = StreamSupport.stream(
        collectionDB.aggregate(pipeline).spliterator(), false /* true may work as well */
    ).count();

I'd be surprised, though, if the API had no way of its own to return a count, so I expect my answer is sub-optimal.
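For what it's worth, MongoDB's aggregation framework does have a server-side option: appending a `$count` stage to the pipeline makes the server return a single document containing the total, so nothing large is buffered on the client. The sketch below uses plain `Map`s to stand in for `org.bson.Document` (an assumption so it runs without the driver on the classpath); with the real driver you could append `Aggregates.count()` or `new Document("$count", "total")` instead:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class CountStage {
    // Returns a copy of the pipeline with a $count stage appended. The server
    // then emits one document like {"total": <n>} instead of the full result set.
    static List<Map<String, Object>> withCount(List<Map<String, Object>> pipeline,
                                               String field) {
        List<Map<String, Object>> copy = new ArrayList<>(pipeline);
        copy.add(Collections.singletonMap("$count", field));
        return copy;
    }

    public static void main(String[] args) {
        // Maps stand in for the Documents used in the question's pipeline.
        List<Map<String, Object>> pipeline = new ArrayList<>();
        pipeline.add(Collections.singletonMap("$match",
                Collections.singletonMap("key", "value")));

        List<Map<String, Object>> counted = withCount(pipeline, "total");
        System.out.println(counted.size()); // 2
        System.out.println(counted.get(1)); // {$count=total}
    }
}
```

Running the counted pipeline in a second `aggregate` call would give the number of matching documents without materializing them in the application's heap.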

  • I tried the approach you suggested, but it still throws the heap memory issue, and it is not giving any results. Can you please suggest how to get the number of modified documents? – Rose Mar 02 '22 at 13:56