I have a Spring Boot application that reads from a database table with potentially millions of rows and thus uses the queryForStream method from Spring Data. This is the code:
Stream<MyResultDto> result = jdbcTemplate.queryForStream("select * from table", myRowMapper));
This runs well for smaller tables, but from about 500 MB of table size the application dies with a stacktrace like this:
Exception in thread "http-nio-8080-Acceptor" java.lang.OutOfMemoryError: Java heap space
at java.base/java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:64)
at java.base/java.nio.ByteBuffer.allocate(ByteBuffer.java:363)
at org.apache.tomcat.util.net.SocketBufferHandler.<init>(SocketBufferHandler.java:58)
at org.apache.tomcat.util.net.NioEndpoint.setSocketOptions(NioEndpoint.java:486)
at org.apache.tomcat.util.net.NioEndpoint.setSocketOptions(NioEndpoint.java:79)
at org.apache.tomcat.util.net.Acceptor.run(Acceptor.java:149)
at java.base/java.lang.Thread.run(Thread.java:833)
2023-01-28 00:37:23.862 ERROR 1 --- [nio-8080-exec-3] o.a.c.h.Http11NioProtocol : Failed to complete processing of a request
java.lang.OutOfMemoryError: Java heap space
2023-01-28 00:37:30.548 ERROR 1 --- [nio-8080-exec-6] o.a.c.c.C.[.[.[.[dispatcherServlet] : Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed; nested exception is java.lang.OutOfMemoryError: Java heap space] with root cause
java.lang.OutOfMemoryError: Java heap space
Exception in thread "http-nio-8080-Poller" java.lang.OutOfMemoryError: Java heap space
As you can probably guess from the stack trace, I am streaming the database results out via a HTTP REST interface. The stack is PostgreSQL 15, the standard PostgreSQL JDBC driver 42.3.8 and the spring-boot-starter-data-jpa is 2.6.14, which results in spring-jdbc 5.3.24 being pulled.
It's worth noting that the table has no primary key, which I suppose should be no problem for the above query. I have not posted the RowMapper, because it never goes to work, the memory literally runs out after sending the query to the database. It just never comes back with a result set that the rowmapper could work on.
I have tried to use a jdbcTemplate.setFetchSize(1000) and also without specifying any fetch size, which I believe would result in the default being used (100 I think). In both cases the same thing happens - large result sets will not be streamed, but somehow exhaust the Java heap space before streaming starts. What could be the reason for this? Isn't the queryForStream method meant to exactly avoid such situations?