
I know similar questions to this have been asked many times before, but even having tried many of the solutions given, I'm still seeing this problem.

Our application allows tech users to create parameterised raw SQL queries to extract data from the DB, which is then downloaded to an Excel spreadsheet.

For smaller datasets this works fine; however, once the file size starts approaching 10 MB+ I start hitting this issue.

The datasets could potentially be 100k rows, or 80-90 MB in size. I don't want to increase the JVM heap size if possible.

Hopefully there is a glaring error in my code that I haven't spotted. The resultSet.next() loop seems to be the source of the issue. Is there a more efficient way to write this so it stops gobbling heap space?

Any help much appreciated. Thanks

/*
 * query       - a raw SQL query that takes parameters (using MyBatis)
 * criteriaMap - the arguments that we substitute into the query
 */

public List<Map<String, Object>> queryForJsonWithoutMapping(final String query, final Map<String, Object> criteriaMap){

    SqlSession sqlSession = getSqlSessionInstance();

    String sql = "";
    Connection connection = null;
    PreparedStatement pstmt = null;
    ResultSet resultSet = null;

    try {

        final Configuration configuration = sqlSession.getConfiguration(); // reuse the session created above

        SqlSourceBuilder builder = new SqlSourceBuilder(configuration);

        SqlSource src = builder.parse(query, Map.class, null);

        BoundSql boundSql = src.getBoundSql(criteriaMap);

        sql = boundSql.getSql();

        List<ParameterMapping> parameterMappings = boundSql.getParameterMappings();

        connection = sqlSession.getConnection();

        pstmt = connection.prepareStatement(sql, java.sql.ResultSet.TYPE_FORWARD_ONLY,  java.sql.ResultSet.CONCUR_READ_ONLY);

        // this function subs the params into the preparedStatement query
        buildParams(parameterMappings, criteriaMap, pstmt);

        resultSet = pstmt.executeQuery();

        // the while loop inside this function is where things start to hang
        List<Map<String, Object>> results = getObjectFromResultSet(resultSet);

        return results;

    } catch (Exception e) {
        LOG.error(e.getMessage(), e); // logs the stack trace once; no need to log it twice
        throw new IllegalStateException(sql + " " + e.getMessage(), e);
    } finally {
        // close resources null-safely and in reverse order of acquisition,
        // so a failure on one doesn't leave the others open
        try {
            if (resultSet != null) resultSet.close();
            if (pstmt != null) pstmt.close();
            if (connection != null) connection.close();
        } catch (SQLException e) {
            LOG.error(e.getMessage(), e);
        }
        sqlSession.close();
    }
}

    private List<Map<String, Object>> getObjectFromResultSet(ResultSet resultSet) throws SQLException {
        // NOTE: getFetchSize() is the driver's row-prefetch hint, not the row count,
        // so it isn't a meaningful initial capacity
        List<Map<String, Object>> entities = new ArrayList<>();
        while (resultSet.next()) {
            // every mapped row is held here until the entire result set has been read;
            // this list is what grows with the size of the query result
            entities.add(getEntityFromResultSet(resultSet));
        }
        resultSet.close();
        return entities;
    }

The DB is Oracle.

shanahobo86
    Looks like a traditional "read everything into memory, then start doing something". You could stream out the results as you read; it'll be a lot more efficient. – Kayaman Dec 08 '16 at 19:13
  • entities.add(index, jsonObject); is the cause of this. It's utilizing memory at a high rate. – Zuko Dec 08 '16 at 21:22

2 Answers


Getting and processing all rows from a DB table in one go is a bad idea. You need to implement the generic idea of pagination, i.e. you read and process one page (n = page-size rows) at a time.

Your page size should be chosen so that you don't make too many DB round trips, while at the same time not holding too many records in memory.

The JdbcPagingItemReader of the Spring Batch API implements this concept.

Refer to this SO question for more ideas on pagination with JDBC.

In addition to that, you shouldn't keep growing your results list. You need to flush it in cycles, e.g. write each page out to the spreadsheet before reading the next; see the sketch below.
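A minimal sketch of that page-by-page loop, assuming an Oracle 12c+ database (for the OFFSET/FETCH row-limiting syntax); getEntityFromResultSet is the row mapper from the question, and writePageToSheet is a hypothetical callback that flushes each page to the Excel file:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public void exportPaged(Connection connection, String baseQuery) throws SQLException {
    final int pageSize = 5000; // tune: fewer DB round trips vs. fewer rows in memory

    // requires a deterministic ORDER BY in baseQuery, or pages may overlap/skip rows
    String pagedSql = baseQuery + " OFFSET ? ROWS FETCH NEXT ? ROWS ONLY";

    try (PreparedStatement pstmt = connection.prepareStatement(pagedSql)) {
        int offset = 0;
        while (true) {
            pstmt.setInt(1, offset);
            pstmt.setInt(2, pageSize);

            List<Map<String, Object>> page = new ArrayList<>(pageSize);
            try (ResultSet rs = pstmt.executeQuery()) {
                while (rs.next()) {
                    page.add(getEntityFromResultSet(rs)); // row mapper from the question
                }
            }
            if (page.isEmpty()) {
                break; // no more rows
            }
            writePageToSheet(page); // hypothetical: write this page out, then discard it
            offset += pageSize;
        }
    }
}

This way only one page of rows is ever alive on the heap, no matter how large the full result is.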

Hope this helps !!

Sabir Khan

In such a design, you will inevitably run out of memory at some point if the query returns a large amount of data, because you're loading the entire ResultSet into memory. Instead, you could simply state that your getXXXFromResultSet APIs have a threshold in terms of the amount of data: for every row you calculate its size and decide whether you can still add it to your JSON doc. Once you've passed the threshold you stop there and close the ResultSet (which will cancel the execution on the server). Another option would involve streaming the results, but that's more complex.
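A rough sketch of that threshold check, as a drop-in variant of the question's getObjectFromResultSet, assuming a hypothetical estimateRowSize helper that approximates the in-memory footprint of one mapped row:

private static final long MAX_RESULT_BYTES = 10L * 1024 * 1024; // e.g. cap at 10 MB

private List<Map<String, Object>> getObjectFromResultSet(ResultSet resultSet) throws SQLException {
    List<Map<String, Object>> entities = new ArrayList<>();
    long bytesSoFar = 0;
    while (resultSet.next()) {
        Map<String, Object> row = getEntityFromResultSet(resultSet);
        bytesSoFar += estimateRowSize(row); // hypothetical size estimator
        if (bytesSoFar > MAX_RESULT_BYTES) {
            break; // stop reading; closing the ResultSet cancels the rest server-side
        }
        entities.add(row);
    }
    resultSet.close();
    return entities;
}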

Jean de Lavarene