Sorry for the vague question; I don't know how to phrase it properly, so if anyone wants to edit it into something more meaningful, please go ahead.

What I actually want to ask is this: when we run a query like:

Select * from TableWithBillionRecords Limit 1000000

using either plain JDBC:

PreparedStatement.executeQuery()

or Hibernate:

getCurrentSession().createQuery(query).list()

Once that line has executed, has the web server actually pulled all 1 million rows from my database server? Or is there some streaming logic behind the interface?

The answer will help me decide whether I should break the query into multiple queries that select chunk by chunk, or whether it is fine to select everything in one go.

Thanks again guys.

Chor Wai Chun
  • search for JDBC pagination – Scary Wombat Feb 29 '16 at 01:48
  • Hibernate is a higher level ORM on top of JDBC. I'm not familiar with Hibernate, but in JDBC, you can do setFetchSize() to get effective chunking. – seand Feb 29 '16 at 02:02
  • @seand setFetchSize(Integer.MIN_VALUE) looks very promising. Just to be sure: 'streaming' row by row means there is only a single connection, right? If so, it's indeed awesome. – Chor Wai Chun Feb 29 '16 at 03:58

2 Answers

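Many JDBC drivers fetch rows in small batches by default; Oracle's driver, for example, has a default fetch size of 10. Defaults vary, though: some drivers, such as MySQL Connector/J, buffer the whole result set in memory unless told otherwise.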

In ordinary JDBC programming with a fetch size of 10, retrieving 1000 rows requires 100 network round trips between your application and the database server. This will certainly affect your application's response time.

The reason is that such drivers are designed to fetch a small number of rows at a time to avoid out-of-memory issues.

For example, if your query retrieves 1 million rows, the JVM heap may not be large enough to hold all of that data at once. Fetching a small batch (say 10 rows) at a time lets the driver support a result set of any size, as long as your application code is designed to process rows incrementally rather than accumulate them.

If you configure the fetch size as 100, the number of network round trips to the database drops to 10, which can noticeably improve your application's performance.
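The round-trip arithmetic above can be sketched as a small helper. This is only an idealized model that assumes one network round trip per batch of rows; the class and method names are hypothetical, and real drivers may add extra trips for executing the statement or closing the cursor.

```java
final class FetchRoundTrips {

    private FetchRoundTrips() {
    }

    /**
     * Estimates how many batches (round trips) are needed to transfer
     * totalRows rows when the driver fetches fetchSize rows per trip.
     * Idealized: ignores any extra trips a real driver might make.
     */
    static long roundTrips(long totalRows, int fetchSize) {
        if (totalRows < 0 || fetchSize <= 0) {
            throw new IllegalArgumentException("totalRows must be >= 0 and fetchSize > 0");
        }
        // Ceiling division: a partially filled last batch still costs one trip.
        return (totalRows + fetchSize - 1) / fetchSize;
    }
}
```

With these assumptions, `FetchRoundTrips.roundTrips(1000, 10)` gives 100 and `FetchRoundTrips.roundTrips(1000, 100)` gives 10, matching the figures in this answer.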

Important notes to consider when tuning the fetch size:

  1. Make sure your JDBC driver supports configuring the fetch size.
  2. The fetch size should be based on your JVM heap settings. Since the JVM heap size varies between environments, don't hard-code the fetch size; keep it as a configurable parameter.
  3. If your fetch size is too large, you might run out of memory. For example, a table with few columns can tolerate a larger fetch size than a table with many columns.
  4. You can also set the fetch size based on the data types involved, such as blobs and images. We follow a naming convention in which every image and blob column has the suffix “Blob”, so I set a high fetch size if the query doesn't contain the word “Blob” and a low fetch size otherwise.
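Points 2 and 4 could look roughly like the sketch below. Everything named here is an assumption for illustration: the class, the method names, and the idea of reading the value from a JVM system property are not from the linked article, and the "Blob" check is the naive substring test described above.

```java
final class FetchSizeChooser {

    private FetchSizeChooser() {
    }

    /**
     * Reads a positive fetch size from a JVM system property (for example
     * -Dapp.jdbc.fetchSize=200, a hypothetical property name), falling back
     * to the given default when the property is missing or invalid.
     */
    static int configuredFetchSize(String propertyName, int fallback) {
        String raw = System.getProperty(propertyName);
        if (raw == null) {
            return fallback;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : fallback;
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    /**
     * Picks the low fetch size when the query text mentions a column
     * following the "Blob" naming convention, and the high one otherwise.
     */
    static int chooseFetchSize(String sql, int lowFetchSize, int highFetchSize) {
        return sql.contains("Blob") ? lowFetchSize : highFetchSize;
    }
}
```

The chosen value would then be passed to `Statement.setFetchSize(int)`. Note that the substring test only works if the naming convention is followed strictly, and it will not notice blob columns hidden behind `SELECT *`.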

Refer: http://webmoli.com/2009/02/01/jdbc-performance-tuning-with-optimal-fetch-size/

Gurpreet Singh

Hibernate fetches results eagerly by default, because that's what most interactions with databases tend to be like: Fetch a couple of records, display them, possibly modify them, store them again.

Fetching millions of rows is an entirely different use-case mostly applied when:

  • Doing ETL
  • Doing analytics (although you should probably do analytics in the database, not in the client)

Specifically, list() does not keep a lazy cursor in the database; it materializes every row before it returns. Hibernate does offer Query.scroll(), which returns a ScrollableResults for cursor-style iteration, but you don't have to use it. You can always resort to plain JDBC in Hibernate projects.
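As a rough illustration of the plain JDBC route, here is a minimal sketch that processes rows one at a time instead of collecting them into a list. The class name, the `RowHandler` callback, and the method name are hypothetical. Whether the driver really streams depends on the driver: as far as I know, MySQL Connector/J streams row by row with a forward-only, read-only statement and `setFetchSize(Integer.MIN_VALUE)`, while PostgreSQL only uses a cursor when auto-commit is off, so treat the fetch size as a hint and check your driver's documentation.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class StreamingQuery {

    private StreamingQuery() {
    }

    /** Hypothetical callback invoked once per row; it must not keep the ResultSet. */
    interface RowHandler {
        void handle(ResultSet row) throws SQLException;
    }

    /**
     * Runs the query on the given connection and hands each row to the
     * handler as it is read, so the application never holds all rows at once.
     * Returns the number of rows processed.
     */
    static long forEachRow(Connection conn, String sql, int fetchSize, RowHandler handler)
            throws SQLException {
        long count = 0;
        // Forward-only, read-only is the cheapest cursor type and the one
        // drivers typically require before they will stream.
        try (PreparedStatement ps = conn.prepareStatement(
                sql, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // A hint to the driver about how many rows to pull per round trip.
            ps.setFetchSize(fetchSize);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    handler.handle(rs);
                    count++;
                }
            }
        }
        return count;
    }
}
```

In a Hibernate project, a `Connection` for this can be obtained inside `Session.doWork(...)`. Everything runs on that single connection, which stays busy until the result set is closed.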

Lukas Eder