
#Issue#

We are trying to optimize our data server application. It stores stocks and quotes in a MySQL database, and we are not satisfied with the fetch performance.

#Context#

  • Database: table stock has around 500 rows; table quote has 3,000,000 to 10,000,000 rows
  • One-to-many association: one stock owns n quotes
  • Around 1000 quotes fetched per request
  • Index on (stockId, date) in the quote table
  • No cache, because in production the queries are always different
  • Hibernate 3
  • MySQL 5.5
  • Java 6
  • JDBC MySQL Connector 5.1.13
  • c3p0 pooling

#Tests and results#

##Protocol##

  • Execution times on the MySQL server are obtained by running the generated SQL queries in the mysql command-line client.
  • The server is in a test context: no other DB reads, no DB writes
  • We fetch 857 quotes for the AAPL stock

##Case 1 : Hibernate with association##

This fills our Stock object with 857 Quote objects (everything is correctly mapped in hibernate.xml).

session.enableFilter("after").setParameter("after", 1322910573000L);
Stock stock = (Stock) session.createCriteria(Stock.class).
add(Restrictions.eq("stockId", stockId)).
setFetchMode("quotes", FetchMode.JOIN).uniqueResult();

SQL generated :

SELECT this_.stockId AS stockId1_1_,
       this_.symbol AS symbol1_1_,
       this_.name AS name1_1_,
       quotes2_.stockId AS stockId1_3_,
       quotes2_.quoteId AS quoteId3_,
       quotes2_.quoteId AS quoteId0_0_,
       quotes2_.value AS value0_0_,
       quotes2_.stockId AS stockId0_0_,
       quotes2_.volume AS volume0_0_,
       quotes2_.quality AS quality0_0_,
       quotes2_.date AS date0_0_,
       quotes2_.createdDate AS createdD7_0_0_,
       quotes2_.fetcher AS fetcher0_0_
FROM stock this_
LEFT OUTER JOIN quote quotes2_ ON this_.stockId=quotes2_.stockId
AND quotes2_.date > 1322910573000
WHERE this_.stockId='AAPL'
ORDER BY quotes2_.date ASC

Results :

  • Execution time on mysql server : ~10 ms
  • Execution time in Java : ~400ms

##Case 2 : Hibernate without association without HQL##

Hoping to improve performance, we used code that fetches only the Quote objects, which we then add manually to a Stock (so we don't fetch repeated stock information for every row). We used createSQLQuery to minimize the effects of aliases and the HQL mess.

String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date ASC";
Stock stock = new Stock(stockId);
stock.addQuotes((ArrayList<Quote>) session.createSQLQuery("select * from quote q where stockId='" + stockId + "' " + filter).addEntity(Quote.class).list());

SQL generated :

SELECT *
FROM quote q
WHERE stockId='AAPL'
  AND q.date>1322910573000
ORDER BY q.date ASC

Results :

  • Execution time on mysql server : ~10 ms
  • Execution time in Java : ~370ms

##Case 3 : JDBC without Hibernate##

String filter = " AND q.date>1322910573000";
filter += " ORDER BY q.date DESC";
Stock stock = new Stock(stockId);
Connection conn = SimpleJDBC.getConnection();
Statement stmt = conn.createStatement();
ResultSet rs = stmt.executeQuery("select * from quote q where stockId='" + stockId + "' " + filter);
while(rs.next())
{
    stock.addQuote(new Quote(rs.getInt("volume"), rs.getLong("date"), rs.getFloat("value"), rs.getByte("fetcher")));
}
stmt.close();
conn.close();

Results :

  • Execution time on mysql server : ~10 ms
  • Execution time in Java : ~100ms

#Our understandings#

  • The JDBC driver is common to all three cases.
  • The JDBC driver itself has a baseline time cost.
  • With similar SQL queries, Hibernate spends more time than plain JDBC code converting result sets into objects.
  • Hibernate's createCriteria, createSQLQuery and createQuery have similar time costs.
  • In production, where we have many concurrent writes, the plain JDBC solution seemed to be slower than the Hibernate one (maybe because our JDBC solution was not pooled).
  • On the MySQL side, the server seems to behave very well, and its time cost is very acceptable.

#Our questions#

  • Is there a way to optimize the performance of the JDBC driver?
  • And will Hibernate benefit from this optimization?
  • Is there a way to optimize Hibernate's performance when converting result sets?
  • Are we facing something that cannot be tuned because of Java's fundamental object and memory management?
  • Are we missing the point, are we stupid, and is all of this in vain?
  • Are we French? Yes.

Your help is very welcome.

DevThiman
  • Have you profiled this using Yourkit profiler or a similar product? Where do you lose the most speed? With TCP/IP? With object creation? Note: I doubt that a properly configured JDBC solution would ever be slower than a Hibernate one... – Lukas Eder Dec 20 '11 at 10:25
  • I agree, the pure JDBC solution was probably suffering from a lack of pooling. – Lucas Mouilleron Dec 20 '11 at 10:34
  • The difference in time for the pure JDBC solution could very well be the time that is needed to send the data over the network. The call `rs.getByte("fetcher")` seems to indicate you are transferring BLOB data. How big is that data? – a_horse_with_no_name Dec 20 '11 at 11:28
  • @a_horse_with_no_name: You meant `rs.getBytes()`? `getByte()` shouldn't cause any issues, I guess? – Lukas Eder Dec 20 '11 at 12:11
  • @LukasEder: oh! I missed the fact that there was no `s` (blush). – a_horse_with_no_name Dec 20 '11 at 12:19

1 Answer


Can you do a smoke test with the simplest possible query, like:

SELECT current_timestamp()

or

SELECT 1 + 1

This will tell you what the actual JDBC driver overhead is. Also, it is not clear whether both tests are performed from the same machine.

Is there a way to optimize the performance of the JDBC driver?

Run the same query several thousand times in Java. The JVM needs some time to warm up (class loading, JIT). Also, I assume SimpleJDBC.getConnection() uses c3p0 connection pooling: the cost of establishing a connection is pretty high, so the first few executions could be slow.
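
As a rough illustration of the warm-up advice above, here is a minimal sketch of a timing harness. The names `WarmupBenchmark`, `measure` and `median` are made up for this example, and the workload in `main` is only a stand-in so that the sketch runs without a database; in a real test the task would execute `SELECT 1` or the quote query through JDBC. The syntax is deliberately kept Java 6 compatible.

```java
import java.util.Arrays;
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of the original post: times a task after a warm-up phase. */
public final class WarmupBenchmark {

    /**
     * Runs the task warmupRuns times without recording anything, then
     * measuredRuns times while recording each duration in nanoseconds.
     */
    public static long[] measure(Callable<?> task, int warmupRuns, int measuredRuns) throws Exception {
        for (int i = 0; i < warmupRuns; i++) {
            task.call(); // gives class loading, the JIT and the pool's first connections time to settle
        }
        long[] samples = new long[measuredRuns];
        for (int i = 0; i < measuredRuns; i++) {
            long start = System.nanoTime();
            task.call();
            samples[i] = System.nanoTime() - start;
        }
        return samples;
    }

    /** Median of the samples; less sensitive to an occasional GC pause than the mean. */
    public static long median(long[] samples) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
    }

    public static void main(String[] args) throws Exception {
        // Stand-in workload (assumption): replace the body of call() with the JDBC query under test.
        Callable<Integer> task = new Callable<Integer>() {
            public Integer call() {
                StringBuilder sb = new StringBuilder();
                for (int i = 0; i < 1000; i++) {
                    sb.append(i);
                }
                return sb.length();
            }
        };
        long[] samples = measure(task, 2000, 1000);
        System.out.println("median: " + median(samples) / 1000.0 + " microseconds");
    }
}
```

Comparing the median for `SELECT 1` with the median for the real quote query should separate the fixed per-call cost from the cost of transferring and converting the 857 rows.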

Also prefer named queries to ad-hoc querying or criteria query.

And will Hibernate benefit from this optimization?

Hibernate is a very complex framework. In your measurements it accounts for roughly 75% of the overall execution time (about 400 ms with Hibernate versus about 100 ms with raw JDBC). If you only need bare-bones ORM (no lazy loading, dirty checking or advanced caching), consider MyBatis, or maybe even JdbcTemplate with its RowMapper abstraction.
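
To give an idea of what the row-mapper approach looks like, here is a sketch that uses only `java.sql`. It is not Spring's JdbcTemplate API: `QuoteMapperSketch`, its nested `RowMapper` interface and the `Quote` stand-in are hypothetical names, and the columns are taken from the Case 3 loop in the question. Unlike the code in the question, it selects only the four columns it reads and binds the parameters instead of concatenating them.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Sketch only: a hand-rolled version of the row-mapper idea, using nothing but java.sql. */
public final class QuoteMapperSketch {

    /** Hypothetical stand-in for the Quote class used in the question. */
    public static final class Quote {
        public final int volume;
        public final long date;
        public final float value;
        public final byte fetcher;

        public Quote(int volume, long date, float value, byte fetcher) {
            this.volume = volume;
            this.date = date;
            this.value = value;
            this.fetcher = fetcher;
        }
    }

    /** Converts the current row of a result set into one object. */
    public interface RowMapper<T> {
        T mapRow(ResultSet rs) throws SQLException;
    }

    /** Same columns and getters as the JDBC loop in Case 3. */
    public static final RowMapper<Quote> QUOTE_MAPPER = new RowMapper<Quote>() {
        public Quote mapRow(ResultSet rs) throws SQLException {
            return new Quote(rs.getInt("volume"), rs.getLong("date"),
                    rs.getFloat("value"), rs.getByte("fetcher"));
        }
    };

    /** Only the needed columns are selected; values are bound, not concatenated. */
    public static final String QUOTES_AFTER =
            "SELECT volume, date, value, fetcher FROM quote "
            + "WHERE stockId = ? AND date > ? ORDER BY date ASC";

    /** Runs the query, maps every row with the given mapper and closes the JDBC resources. */
    public static <T> List<T> query(Connection conn, String sql, RowMapper<T> mapper,
                                    Object... params) throws SQLException {
        PreparedStatement ps = conn.prepareStatement(sql);
        try {
            for (int i = 0; i < params.length; i++) {
                ps.setObject(i + 1, params[i]); // JDBC parameter indexes are 1-based
            }
            ResultSet rs = ps.executeQuery();
            try {
                List<T> result = new ArrayList<T>();
                while (rs.next()) {
                    result.add(mapper.mapRow(rs));
                }
                return result;
            } finally {
                rs.close();
            }
        } finally {
            ps.close();
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length < 3) {
            System.out.println("usage: QuoteMapperSketch <jdbcUrl> <user> <password>");
            return;
        }
        Connection conn = DriverManager.getConnection(args[0], args[1], args[2]);
        try {
            List<Quote> quotes = query(conn, QUOTES_AFTER, QUOTE_MAPPER, "AAPL", 1322910573000L);
            System.out.println(quotes.size() + " quotes loaded");
        } finally {
            conn.close();
        }
    }
}
```

The mapping here is a plain constructor call per row, with no reflection, proxies or session bookkeeping, which is the main reason this style tends to stay close to raw JDBC timings.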

Is there a way to optimize Hibernate's performance when converting result sets?

Not really. Check out Chapter 19, "Improving performance", in the Hibernate documentation. There is a lot of reflection and class generation going on in there. Once again, Hibernate might not be the best solution when you want to squeeze every millisecond out of your database.

However, it is a good choice when you want to improve the overall user experience, thanks to its extensive caching support. Check out the performance doc again; it mostly talks about caching. There is a first-level cache, a second-level cache, a query cache... This is where Hibernate might actually outperform simple JDBC: it can cache a lot, in ways you could not even imagine. On the other hand, a poor cache configuration would lead to an even slower setup.
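
The trade-off can be illustrated with a toy read-through cache. This is a conceptual sketch only and not how Hibernate implements its caches; `ReadThroughCacheSketch` and `Loader` are hypothetical names, and the loader stands in for "run the query against the database". A hit skips the database entirely, while a cache that is too small (or queries that never repeat, as in the question) only adds bookkeeping on top of every miss.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Conceptual sketch only; this is not how Hibernate implements its caches. */
public final class ReadThroughCacheSketch<K, V> {

    /** Hypothetical loader standing in for a database query. */
    public interface Loader<K, V> {
        V load(K key);
    }

    private final Map<K, V> entries;
    private final Loader<K, V> loader;
    private int hits;
    private int misses;

    public ReadThroughCacheSketch(final int capacity, Loader<K, V> loader) {
        this.loader = loader;
        // An access-ordered LinkedHashMap lets us evict the least recently used entry.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns the cached value, or loads and stores it on a miss (null values are never treated as hits). */
    public synchronized V get(K key) {
        V value = entries.get(key);
        if (value != null) {
            hits++;
            return value;
        }
        misses++;
        value = loader.load(key);
        entries.put(key, value);
        return value;
    }

    public synchronized int hits() {
        return hits;
    }

    public synchronized int misses() {
        return misses;
    }

    public static void main(String[] args) {
        Loader<String, Integer> slowQuery = new Loader<String, Integer>() {
            public Integer load(String symbol) {
                return symbol.length(); // stand-in for an expensive database round trip
            }
        };
        String[] requests = {"AAPL", "AAPL", "MSFT", "AAPL"};

        ReadThroughCacheSketch<String, Integer> roomy =
                new ReadThroughCacheSketch<String, Integer>(2, slowQuery);
        ReadThroughCacheSketch<String, Integer> tooSmall =
                new ReadThroughCacheSketch<String, Integer>(1, slowQuery);
        for (String symbol : requests) {
            roomy.get(symbol);
            tooSmall.get(symbol);
        }
        System.out.println("capacity 2: hits=" + roomy.hits() + " misses=" + roomy.misses());       // hits=2 misses=2
        System.out.println("capacity 1: hits=" + tooSmall.hits() + " misses=" + tooSmall.misses()); // hits=1 misses=3
    }
}
```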

Check out: Caching with Hibernate + Spring - some Questions!

Are we facing something that cannot be tuned because of Java's fundamental object and memory management?

The JVM (especially in server configuration) is quite fast. Object creation on the heap is as fast as stack allocation in, e.g., C, and garbage collection has been greatly optimized. I don't think the Java version running plain JDBC would be much slower than a more native connection. That's why I suggested a few improvements to your benchmark.

Are we missing the point, are we stupid, and is all of this in vain?

I believe that JDBC is a good choice if performance is your biggest issue. Java has been used successfully in a lot of database-heavy applications.

Community
Tomasz Nurkiewicz
  • What do you think about deactivating Hibernate's 2nd level caching (if that is even possible)? Couldn't that speed things up remarkably? – Lukas Eder Dec 20 '11 at 10:51
  • @LukasEder: Hibernate's 2nd level cache is disabled by default. I believe (although I encourage you to profile this) that most of the time is spent in reflection code in Hibernate - which you can't really avoid. – Tomasz Nurkiewicz Dec 20 '11 at 11:00
  • I didn't know it was deactivated by default, thanks for the hint. I agree with the need for profiling. – Lukas Eder Dec 20 '11 at 11:08