3

My problem is this: I am trying to process about 1.5 million rows of data in Spring via JdbcTemplate, coming from MySQL. With such a large number of rows, I am using the RowCallbackHandler class as suggested here.

The code is actually working, but it's SLOW. No matter what I set the fetch size to, I seem to get approximately 350 records per fetch, with a 2 to 3 second delay between fetches (from observing my logs). I tried commenting out the store command and confirmed that the behavior stayed the same, so the problem is not with the writes.

There are 6 columns, only 1 of which is a varchar, and that one is only 25 characters long, so I can't see throughput being the issue.

Ideally I'd like to get more like 30000-50000 rows at a time. Is there a way to do that?

Here is my code:


    protected void runCallback(String query, Map<String, Object> params, int fetchSize, RowCallbackHandler rch)
            throws DatabaseException {
        int oldFetchSize = getJdbcTemplate().getFetchSize();
        if (fetchSize > 0) {
            getJdbcTemplate().setFetchSize(fetchSize);
        }
        try {
            getJdbcTemplate().query(getSql(query), rch);
        }
        catch (DataAccessException ex) {
            logger.error(ExceptionUtils.getStackTrace(ex));
            throw new DatabaseException(ex.getMessage());
        }
        finally {
            // Restore the previous fetch size even if the query throws.
            getJdbcTemplate().setFetchSize(oldFetchSize);
        }
    }

and the handler:

    public class SaveUserFolderStatesCallback implements RowCallbackHandler {
        @Override
        public void processRow(ResultSet rs) throws SQLException {
            // Save each row sequentially.
            // Do NOT call ResultSet.next() -- Spring advances the cursor.

            Calendar asOf = Calendar.getInstance();
            log.info("AS OF DATE: " + asOf.getTime());
            long x = rs.getLong("x");
            long xx = rs.getLong("xx");
            String xxx = rs.getString("xxx");
            BigDecimal budgetAmountBD = rs.getBigDecimal("xxxx");
            double xxxx = (budgetAmountBD == null) ? 0.0 : budgetAmountBD.doubleValue();
            BigDecimal actualAmountBD = rs.getBigDecimal("xxxxx");
            double xxxxx = (actualAmountBD == null) ? 0.0 : actualAmountBD.doubleValue();

            dbstore(x, xx, xxx, xxxx, xxxxx, asOf);
        }
    }
ControlAltDel

3 Answers

0

And what is your query? Try creating indexes for the fields you are searching/sorting on. That will help.

Second strategy: an in-memory cache implementation, or using Hibernate with a second-level cache.

Both of these techniques can significantly speed up your query execution.

alexey28
  • For #1, I'm just doing a select * on the table, so with no sorting an index isn't going to help; for #2, changing DBs is not an option – ControlAltDel May 08 '12 at 18:51
  • A memory cache doesn't require you to change the DB – alexey28 May 10 '12 at 07:16
  • For a memory cache implementation you would have some service with a lazily initialized collection (if you just do a select without any where/order). Initialize this collection with your query result the first time it is executed, and for subsequent calls just return the result from memory. – alexey28 May 10 '12 at 08:48
0

A few questions:

  1. How long does it take if you query the DB directly? Another issue could be ASYNC_NETWORK_IO delay between the application and DB hosts.

  2. Did you check it without using Spring?

Chandra
  • I'm sorry, I thought I had it but I don't. I don't think this is it, but how do I inspect ASYNC_NETWORK_IO? – ControlAltDel May 08 '12 at 18:50
  • SQL Server provides DMVs to query the wait types, as in here: http://www.confio.com/db-resources/sql-wait-types/async-network-io/ but I am not aware of the equivalent in MySQL. Did you check how long it takes if you query directly from the console? That should give you a fair idea of whether the client application is the issue. Then you can check whether the network or Spring is causing the problem – Chandra May 09 '12 at 07:33
0

The answer actually is to call setFetchSize(Integer.MIN_VALUE). While this totally violates the stated contract of Statement.setFetchSize, the MySQL JDBC connector uses this value as a signal to stream the result set row by row instead of buffering it all client-side. This results in a tremendous performance improvement.
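As a minimal illustration of the trick outside Spring (the connection URL, credentials, and table name below are placeholders, not from the original post), streaming with plain JDBC looks roughly like this: a forward-only, read-only statement with fetch size Integer.MIN_VALUE makes MySQL Connector/J hand rows to the client one at a time instead of buffering the entire result set in memory.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class StreamingExample {
    public static void main(String[] args) throws SQLException {
        // Placeholder URL/credentials -- substitute your own.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost/mydb", "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                "SELECT x, xx, xxx FROM my_table",
                ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            // Connector/J treats this sentinel as "stream the result set".
            ps.setFetchSize(Integer.MIN_VALUE);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    // Process one row at a time; client memory stays flat.
                }
            }
        }
    }
}
```

Note that while streaming, the connection cannot issue other statements until the result set is fully read or closed.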

The other part of the fix is that I also needed to create my own subclass of (Spring) JdbcTemplate that would accommodate the negative fetch size. Actually, I took the code example from here, where I first found the idea of setting fetchSize(Integer.MIN_VALUE):

http://javasplitter.blogspot.com/2009/10/pimp-ma-jdbc-resultset.html
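A sketch of such a subclass (the class name is mine, and this is an assumption about the approach, not the exact code from that link): stock JdbcTemplate only forwards positive fetch sizes to the Statement, so the override of its applyStatementSettings hook passes the streaming sentinel through.

```java
import java.sql.SQLException;
import java.sql.Statement;
import javax.sql.DataSource;
import org.springframework.jdbc.core.JdbcTemplate;

// Hypothetical subclass: lets the Connector/J streaming sentinel reach the
// Statement, which plain JdbcTemplate would otherwise ignore.
public class StreamingJdbcTemplate extends JdbcTemplate {

    public StreamingJdbcTemplate(DataSource dataSource) {
        super(dataSource);
    }

    @Override
    protected void applyStatementSettings(Statement stmt) throws SQLException {
        if (getFetchSize() == Integer.MIN_VALUE) {
            // MySQL Connector/J interprets Integer.MIN_VALUE as "stream rows".
            stmt.setFetchSize(Integer.MIN_VALUE);
            // Still honor a configured query timeout, as the parent would.
            if (getQueryTimeout() > 0) {
                stmt.setQueryTimeout(getQueryTimeout());
            }
        } else {
            super.applyStatementSettings(stmt);
        }
    }
}
```

With this in place, calling setFetchSize(Integer.MIN_VALUE) on the template before running the query (for example from runCallback above) enables streaming without touching the callback code.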

Thank you both for your help!

ControlAltDel