
I have two external APIs, one to download data from and the other to upload data to DB tables [PostgreSQL]. The table is quite big. Over time, we observed that the server slowly keeps taking memory without releasing much of it, until it throws an error and quits [sometimes Linux kills it]. I checked the memory dump and I cannot find anything in it that I can relate to my code. I don't use any local caching or anything like that. But today I got this:

java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.sql.Timestamp.toString(Timestamp.java:350)
        at java.lang.String.valueOf(String.java:2994)
        at org.jooq.impl.AbstractParam.name(AbstractParam.java:107)
        at org.jooq.impl.AbstractParam.<init>(AbstractParam.java:81)
        at org.jooq.impl.AbstractParam.<init>(AbstractParam.java:77)
        at org.jooq.impl.Val.<init>(Val.java:63)
        at org.jooq.impl.DSL.val(DSL.java:15157)
        at org.jooq.impl.Tools.field(Tools.java:1092)
        at org.jooq.impl.Tools.fields(Tools.java:1226)
        at org.jooq.impl.BatchSingle.executePrepared(BatchSingle.java:231)
        at org.jooq.impl.BatchSingle.execute(BatchSingle.java:182)
        at org.jooq.impl.BatchCRUD.executePrepared(BatchCRUD.java:159)
        at org.jooq.impl.BatchCRUD.execute(BatchCRUD.java:100)

For fetching, I use the normal fetch() function, and for dumping data into the DB, I use jOOQ's batchInsert and batchUpdate methods. Are there any good practices with jOOQ for dealing with large sets of data? Am I missing something?

kunal dexit

1 Answer


Both fetch() and batchInsert() / batchUpdate() will buffer your content, i.e. your results and/or your bind variables, entirely in memory. With data sets as large as yours, that buffering is not recommended. Let's look at both operations individually:

Large reads

For large reads, you should avoid fetch(), as it fetches all the records from your database and stores them in memory before you consume them. In most cases this is a better approach than keeping cursors / resources open (the way JDBC does), but when results are large, it is prohibitive. Instead, you should use fetchLazy() or fetchStream():

// Using fetchLazy():
try (Cursor<Record> cursor = query.fetchLazy()) {
    ...
}

// Using fetchStream():
try (Stream<Record> stream = query.fetchStream()) {
    ...
}

Don't forget to use try-with-resources to ensure that all resources, such as the underlying ResultSet and PreparedStatement, are closed when you're done.

Do note that there is also ResultQuery.fetchSize(), which lets you specify the JDBC fetch size so that the PostgreSQL JDBC driver doesn't buffer too many rows either.
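For example, the two can be combined like this. This is a rough sketch, not your exact code: MY_TABLE and process() are placeholders, and ctx is assumed to be a DSLContext whose underlying connection has auto-commit disabled, which the PostgreSQL JDBC driver requires before it uses cursor-based fetching and honours the fetch size:

// Sketch: stream a large result with fetchSize() + fetchLazy().
// MY_TABLE and process() are placeholders; ctx is a DSLContext on a
// connection with auto-commit disabled.
try (Cursor<Record> cursor = ctx
        .select()
        .from(MY_TABLE)
        .fetchSize(500)   // let the driver fetch around 500 rows at a time
        .fetchLazy()) {

    for (Record record : cursor) {
        process(record);  // only the current window of rows is in memory
    }
}

This way, only the driver's current fetch window is held on the client, no matter how big the table is.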

Large writes

You shouldn't write huge inserts in one go in any database. Apart from the client memory issues that you may run into, there is also a risk of server-side issues related to growing UNDO/REDO logs. Ideally, you should partition your writes into small chunks (see the sketch after the list) by fine-tuning:

  • bulk size
  • batch size
  • commit size
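For instance, rather than collecting everything into one list and calling batchInsert() once, you could flush fixed-size chunks as the data arrives. This is only a sketch: incoming, MyTableRecord, and the chunk size of 1000 are placeholders to adapt and tune:

// Sketch: flush the incoming records in fixed-size batches instead of
// buffering them all. "incoming" stands for whatever produces the
// records (e.g. the API download); 1000 is just a starting point.
List<MyTableRecord> buffer = new ArrayList<>();
for (MyTableRecord record : incoming) {
    buffer.add(record);
    if (buffer.size() >= 1000) {
        ctx.batchInsert(buffer).execute();
        buffer.clear();
    }
}
if (!buffer.isEmpty())
    ctx.batchInsert(buffer).execute();   // flush the remainder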

jOOQ's Loader API can import CSV/JSON/array/record data and takes care of the above fine-tuning for you. It is documented here:

http://www.jooq.org/doc/latest/manual/sql-execution/importing
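Again as a rough sketch (MY_TABLE, its fields, and the records source are placeholders; the bulk/batch/commit numbers are just starting points to tune):

// Sketch: let the Loader API handle bulk, batch and commit sizes.
// MY_TABLE, its fields and "records" are placeholders; note that
// execute() declares IOException for the file-based sources.
ctx.loadInto(MY_TABLE)
   .bulkAfter(500)        // combine 500 rows into one bulk INSERT
   .batchAfter(10)        // send 10 bulk statements per JDBC batch
   .commitAfter(10)       // commit every 10 batches
   .loadRecords(records)  // or loadCSV(...), loadJSON(...), loadArrays(...)
   .fields(MY_TABLE.ID, MY_TABLE.CREATED_AT, MY_TABLE.PAYLOAD)
   .execute();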

Lukas Eder