
I'm trying to fetch more than 1 million records in a microservice and I'm getting an error. I have to keep increasing the app memory for the service in order to fetch that much data. Is there any method I can use to fetch huge amounts of data without increasing the app memory?

2022-10-11T11:22:04.898+08:00 [APP/PROC/WEB/0] [ERR] Resource exhaustion event: the JVM was unable to allocate memory from the heap.
2022-10-11T11:22:04.898+08:00 [APP/PROC/WEB/0] [ERR] ResourceExhausted! (1/0)



    @Value("${batch-size}")
    private int batchSize;

    public void archiveTableRecords(JdbcTemplate sourceDbTemplate, JdbcTemplate targetDbTemplate,
            ArchiveConfigDTO archiveObj) {
        try {
            String sourceTable = archiveObj.getSourceTable();
            String archiveMonths = archiveObj.getArchiveCriteriaMonths();
            List<Object> primaryKeyValueList = new ArrayList<>();
            String compareDate = getCSTDateNew(archiveMonths);
            logger.info("Archive criteria date: {}", compareDate);
            List<Map<String, Object>> sourceRecords = sourceDbTemplate
                    .queryForList(ArchiveSQLQueries.buildSQLQueryToFetchSourceRecords(sourceTable), compareDate);
            int sourceRecordsSize = sourceRecords.size();
            logger.info("Fetched {} {} record(s)", sourceRecordsSize, sourceTable);
            if (sourceRecordsSize > 0) {
                int recordsInserted = copySourceRecords(targetDbTemplate, archiveObj.getTargetTable(),
                        archiveObj.getPrimaryKeyColumn(), sourceRecords, primaryKeyValueList);
                if (recordsInserted > 0) {
                    deleteSourceRecords(sourceDbTemplate, sourceTable, archiveObj.getPrimaryKeyColumn(),
                            primaryKeyValueList);
                }
            }
        } catch (Exception e) {
            logger.error("Exception in archiveTableRecords", e);
        }
    }
    
    
    public static String buildSQLQueryToFetchSourceRecords(String sourceTable) {
        return "SELECT * FROM " + sourceTable + " WHERE update_dts <= ?";
    }
    
    
    public int copySourceRecords(JdbcTemplate targetDbTemplate, String targetTable, String primaryKeyColumn,
            List<Map<String, Object>> sourceRecords, List<Object> primaryKeyValueList) {
        logger.info("Copying records to {}", targetTable);
        int[][] insertResult = targetDbTemplate.batchUpdate(
                ArchiveSQLQueries.buildSQLTargetRecordInsertionQuery(targetTable, sourceRecords.get(0),
                        primaryKeyColumn),
                sourceRecords, batchSize,
                (PreparedStatement ps, Map<String, Object> record) -> {
                    int index = 1;
                    for (Entry<String, Object> entry : record.entrySet()) {
                        if (entry.getKey().equals(primaryKeyColumn)) {
                            primaryKeyValueList.add(entry.getValue());
                        } else {
                            ps.setObject(index++, entry.getValue());
                        }
                    }
                });

        int result = getSumOfArray(insertResult);
        logger.info("Inserted {} record(s) in {}", result, targetTable);
        return result;
    }

I have tried the above code, but I'm still getting the error when fetching the data.

1 Answer

Judging by the method names, you copy from one table to another. Do it in batches using LIMIT and OFFSET. Simplified example:

SELECT *
FROM table
WHERE condition
ORDER BY column_which_ensures_last_added_data_is_fetched_last
LIMIT 5
OFFSET 0;

For the next batch, change the offset by the number of processed records. Refer to your database's documentation for more examples. ORDER BY a column so you get the oldest data first and the newest last, to ensure you won't get duplicates. If the primary key is an auto-incremented id, it's a good choice for this purpose.

If you need the count for logging purposes, either use the COUNT function or calculate the number of processed entities from the count of batches.

SELECT COUNT(*)
FROM table
WHERE condition;
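As a sketch only, the count query could be built and issued with Spring's `JdbcTemplate` roughly like this. The class and method names here are hypothetical, and the WHERE clause simply mirrors the fetch query from the question:

```java
// Sketch: builds a COUNT query matching the archive fetch's WHERE clause.
// The table name is passed in, as in the question's buildSQLQueryToFetchSourceRecords.
public class ArchiveCountQuery {
    public static String buildCountQuery(String table) {
        return "SELECT COUNT(*) FROM " + table + " WHERE update_dts <= ?";
    }
    // With a JdbcTemplate at hand, the count could then be read as:
    //   Long total = jdbcTemplate.queryForObject(
    //           buildCountQuery(sourceTable), Long.class, compareDate1);
}
```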

Algorithm pseudo code:

int processed = 0;
List<Map<String, Object>> list = //fetch using above query, replace offset with processed
while(!list.isEmpty()) {
  //copy contents of list to target
  //handle exceptions, data not inserted in target, etc.
  processed += list.size();
  list = //fetch using above query, replace offset with processed
}
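The pseudocode above can be sketched in Java as follows. This is an illustration only: the batch fetch is abstracted behind a `BiFunction` so the paging logic stands on its own; in the question's setup it would wrap `sourceDbTemplate.queryForList(...)` with a LIMIT/OFFSET query. All names here are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Consumer;

// Sketch of the batched copy loop (names are illustrative).
public class BatchedArchiver {

    // Appends LIMIT/OFFSET to a base query. The syntax varies by database;
    // e.g. Oracle 12c+ uses OFFSET ? ROWS FETCH NEXT ? ROWS ONLY.
    public static String pagedQuery(String baseQuery, int limit, int offset) {
        return baseQuery + " LIMIT " + limit + " OFFSET " + offset;
    }

    // fetchBatch takes (limit, offset) and returns one page of records;
    // copyBatch inserts the page into the target table.
    // Returns the total number of records processed.
    public static int archiveInBatches(
            BiFunction<Integer, Integer, List<Map<String, Object>>> fetchBatch,
            Consumer<List<Map<String, Object>>> copyBatch,
            int batchSize) {
        int processed = 0;
        List<Map<String, Object>> batch = fetchBatch.apply(batchSize, processed);
        while (!batch.isEmpty()) {
            copyBatch.accept(batch);
            // Note: if each copied batch is also deleted from the source (as in
            // the question), keep the offset at 0 instead of advancing it,
            // because deleting shifts the remaining rows forward.
            processed += batch.size();
            batch = fetchBatch.apply(batchSize, processed);
        }
        return processed;
    }
}
```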
Chaosfire
  • Hi @chaosfire, I'm copying the data in batches; I have edited the code to show how I'm copying it, and the batch size is 500. I'm getting the memory error when fetching the data itself. – QueenOfCode Oct 28 '22 at 10:01
  • Just remember that if records are being inserted between your queries you could see the same record twice. – tgdavies Oct 28 '22 at 11:16
  • @QueenOfCode You try to do the insert into the target in batches, but you load **everything** from the source into memory, which is the actual problem. This solution loads data from the source in batches - load a batch from the source, save the batch in the target, move on to the next batch. You need to do it this way in order not to run out of memory. – Chaosfire Oct 28 '22 at 14:24
  • @tgdavies Good point, updated answer to take care of this case. – Chaosfire Oct 28 '22 at 14:30
  • @Chaosfire could you please show me in the code how this can be handled. Thanks in advance, Sir – QueenOfCode Oct 31 '22 at 09:11
  • Can I use the setFetchSize function, java.sql.Statement.setFetchSize()? E.g. st.setFetchSize(5000) – QueenOfCode Oct 31 '22 at 09:46
  • @QueenOfCode No, it does something else entirely - [What does Statement.setFetchSize(nSize) method really do in SQL Server JDBC driver?](https://stackoverflow.com/questions/1318354/what-does-statement-setfetchsizensize-method-really-do-in-sql-server-jdbc-driv). Probably you are looking for `setMaxRows()`, but that's not good idea, check [SQL LIMIT vs. JDBC Statement setMaxRows. Which one is better?](https://stackoverflow.com/questions/32707682/sql-limit-vs-jdbc-statement-setmaxrows-which-one-is-better) – Chaosfire Oct 31 '22 at 11:21