
I need to write a little program in Java that reads a huge number of records from a PostgreSQL database, converts them to JSON and writes them elsewhere.

Obviously, I can't load all the records at the same time (500,000 rows) if I don't want to get an OutOfMemoryError.

What's the best way to load, say, 1000 records into memory, process them, and then move on to the next 1000? Does anyone have an idea?

corem
  • Please note that there are many similar questions; one of them is [this one](http://stackoverflow.com/questions/5067619/jpa-what-is-the-proper-pattern-for-iterating-over-large-result-sets) – V G Jan 29 '15 at 14:28

1 Answer


By default, the PostgreSQL JDBC driver loads the complete ResultSet into memory. This can be prevented by turning off autocommit and enabling cursor-based retrieval by setting a fetch size greater than 0.

Details are in the manual: https://jdbc.postgresql.org/documentation/head/query.html#query-with-cursor

So if you do the following:

Connection con = ....; // open the connection
con.setAutoCommit(false);  // cursor-based fetching only works with autocommit off
Statement stmt = con.createStatement();
stmt.setFetchSize(1000);   // retrieve 1000 rows per round trip
ResultSet rs = stmt.executeQuery("select ...");
while (rs.next())
{
   // do something with the row
}

This will never load more than 1000 rows into memory. The number of rows held in memory can be controlled by changing the fetch size.
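For the JSON conversion mentioned in the question, the rows can be written out as they are read, so the full result never has to sit in memory. Here is a minimal sketch of that idea; Jackson's streaming JsonGenerator, the connection details, the table name and the output file are my own assumptions, not part of the answer above:

    import com.fasterxml.jackson.core.JsonFactory;
    import com.fasterxml.jackson.core.JsonGenerator;

    import java.io.FileWriter;
    import java.io.Writer;
    import java.sql.*;

    public class ExportToJson
    {
        public static void main(String[] args) throws Exception
        {
            // placeholder connection details -- adjust URL, user and password
            try (Connection con = DriverManager.getConnection(
                    "jdbc:postgresql://localhost/mydb", "user", "secret"))
            {
                con.setAutoCommit(false);        // required for cursor-based fetching
                try (Statement stmt = con.createStatement())
                {
                    stmt.setFetchSize(1000);     // keep roughly 1000 rows in memory at a time
                    try (ResultSet rs = stmt.executeQuery("select * from my_table");
                         Writer out = new FileWriter("records.json");
                         JsonGenerator json = new JsonFactory().createGenerator(out))
                    {
                        ResultSetMetaData meta = rs.getMetaData();
                        json.writeStartArray();
                        while (rs.next())
                        {
                            // write each row as a JSON object: column label -> string value
                            json.writeStartObject();
                            for (int i = 1; i <= meta.getColumnCount(); i++)
                            {
                                String value = rs.getString(i);
                                if (value == null)
                                {
                                    json.writeNullField(meta.getColumnLabel(i));
                                }
                                else
                                {
                                    json.writeStringField(meta.getColumnLabel(i), value);
                                }
                            }
                            json.writeEndObject();
                        }
                        json.writeEndArray();
                    }
                }
            }
        }
    }

Because the generator streams each object to the writer as soon as it is produced, memory usage stays bounded by the fetch size rather than by the total number of rows.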

I don't know JPA, but controlling autocommit should be straightforward. I don't know how to control the fetch size through JPA, though; that probably also depends on your JPA provider.
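As a hedged addition (not from the answer above): if the JPA provider happens to be Hibernate, the fetch size can be passed as the query hint "org.hibernate.fetchSize", which Hibernate forwards to Statement.setFetchSize(). A rough sketch, where MyRecord is a hypothetical mapped entity and other providers may ignore or rename the hint:

    import java.util.stream.Stream;
    import javax.persistence.EntityManager;
    import javax.persistence.TypedQuery;

    public class JpaFetchSizeExample
    {
        // MyRecord is a hypothetical @Entity; the hint name is Hibernate-specific.
        public static void process(EntityManager em)
        {
            TypedQuery<MyRecord> query =
                    em.createQuery("select r from MyRecord r", MyRecord.class);
            query.setHint("org.hibernate.fetchSize", 1000); // forwarded to Statement.setFetchSize()

            // JPA 2.2: stream the results instead of materializing one big List
            try (Stream<MyRecord> records = query.getResultStream())
            {
                records.forEach(record -> {
                    // convert to JSON and write it out here
                });
            }
        }
    }

Note that inside a JPA transaction autocommit is already off, which is what the PostgreSQL driver's cursor-based fetching requires; whether the entities are also evicted from the persistence context as you go is a separate, provider-specific concern.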