
I have a MERGE INTO statement like this:

   private static final String UPSERT_STATEMENT = "MERGE INTO " + TABLE_NAME + " tbl1 " +
    "USING (SELECT ? as KEY, ? as DATA, ? as LAST_MODIFIED_DATE FROM dual) tbl2 " +
    "ON (tbl1.KEY = tbl2.KEY) " +
    "WHEN MATCHED THEN UPDATE SET DATA = tbl2.DATA, LAST_MODIFIED_DATE = tbl2.LAST_MODIFIED_DATE " +
    "WHEN NOT MATCHED THEN " +
    "INSERT (DETAILS, KEY, DATA, CREATION_DATE, LAST_MODIFIED_DATE) " +
    "VALUES (SEQ.NEXTVAL, tbl2.KEY, tbl2.DATA, tbl2.LAST_MODIFIED_DATE, tbl2.LAST_MODIFIED_DATE)";

This is the execution method:

 public void mergeInto(final JavaRDD<Tuple2<Long, String>> rows) {
    if (rows != null && !rows.isEmpty()) {
        rows.foreachPartition((Iterator<Tuple2<Long, String>> iterator) -> {

            JdbcTemplate jdbcTemplate = jdbcTemplateFactory.getJdbcTemplate();
            LobCreator lobCreator = new DefaultLobHandler().getLobCreator();

            while (iterator.hasNext()) {
                Tuple2<Long, String> row = iterator.next();
                String details = row._2();
                Long key = row._1();

                java.sql.Date lastModifiedDate = Date.valueOf(LocalDate.now());
                Boolean isSuccess = jdbcTemplate.execute(UPSERT_STATEMENT, (PreparedStatementCallback<Boolean>) ps -> {

                    ps.setLong(1, key);
                    lobCreator.setBlobAsBytes(ps, 2, details.getBytes());
                    ps.setObject(3, lastModifiedDate);
                    return ps.execute();
                });
                System.out.println(row + "_" + isSuccess);

            }
        });
    }
}

I need to execute many of these upsert statements against the database, in bulks of 10K if possible.

  1. What is the most efficient way to save time: executing 10K statements at once, or executing 10K statements in the same transaction?
  2. How should I change the method to support this?

Thanks, Me

userit1985

1 Answer

  1. The most efficient way would be one that bulk-loads your data into the database. Compared to one-by-one uploads (as in your example), I'd expect performance gains of at least 1 or 2 orders of magnitude ("bigger" data means less to be gained by bulk-inserting).

  2. You could use the technique described in this answer to bulk-insert your records into a temporary table first, and then perform a single MERGE statement using the temporary table.
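A minimal sketch of the temp-table approach with plain JDBC, assuming a hypothetical staging table `TMP_UPSERT` with matching columns (the table name, `MY_TABLE`, and the row layout are illustrative, not from the question): batch-insert the rows in chunks of 10K with `addBatch()`/`executeBatch()`, then run the MERGE once against the staging table.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.ArrayList;
import java.util.List;

public class TempTableUpsert {

    // Hypothetical staging table; adjust names and types to your schema.
    private static final String INSERT_TMP =
        "INSERT INTO TMP_UPSERT (KEY, DATA, LAST_MODIFIED_DATE) VALUES (?, ?, ?)";

    private static final String MERGE_FROM_TMP =
        "MERGE INTO MY_TABLE tbl1 " +
        "USING TMP_UPSERT tbl2 ON (tbl1.KEY = tbl2.KEY) " +
        "WHEN MATCHED THEN UPDATE SET DATA = tbl2.DATA, LAST_MODIFIED_DATE = tbl2.LAST_MODIFIED_DATE " +
        "WHEN NOT MATCHED THEN INSERT (DETAILS, KEY, DATA, CREATION_DATE, LAST_MODIFIED_DATE) " +
        "VALUES (SEQ.NEXTVAL, tbl2.KEY, tbl2.DATA, tbl2.LAST_MODIFIED_DATE, tbl2.LAST_MODIFIED_DATE)";

    /** Splits rows into chunks of at most chunkSize (pure helper, used below). */
    static <T> List<List<T>> chunks(List<T> rows, int chunkSize) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += chunkSize) {
            out.add(rows.subList(i, Math.min(i + chunkSize, rows.size())));
        }
        return out;
    }

    /**
     * Batch-inserts into the staging table, then merges once.
     * Each row is {Long key, byte[] data, java.sql.Date lastModified}.
     * Requires an open Connection; everything runs in one transaction.
     */
    static void upsert(Connection conn, List<Object[]> rows) throws Exception {
        conn.setAutoCommit(false);
        try (PreparedStatement ins = conn.prepareStatement(INSERT_TMP)) {
            for (List<Object[]> chunk : chunks(rows, 10_000)) {
                for (Object[] row : chunk) {
                    ins.setLong(1, (Long) row[0]);
                    ins.setBytes(2, (byte[]) row[1]);
                    ins.setDate(3, (java.sql.Date) row[2]);
                    ins.addBatch();
                }
                ins.executeBatch(); // one round trip per 10K rows
            }
        }
        try (PreparedStatement merge = conn.prepareStatement(MERGE_FROM_TMP)) {
            merge.executeUpdate(); // single set-based MERGE
        }
        conn.commit();
    }
}
```

Note that the single MERGE operates on the whole staging table at once, so the database can do a set-based join instead of 10K individual index lookups.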

HAL 9000
  • I saw there is a way to bulk-insert using batchUpdate, as described in https://stackoverflow.com/questions/9565481/how-to-do-multiple-inserts-in-database-using-spring-jdbc-template-batch What are the advantages of each method? – userit1985 Jun 01 '17 at 09:45
  • You'll have to measure/analyze it. They might be the same or very different. But at least for Oracle the one using addBatch() should use one roundtrip *per batch* (true bulk load) vs one roundtrip per row. – HAL 9000 Jun 01 '17 at 11:24
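For reference, the `addBatch()` variant mentioned in the comments could look like the sketch below, applied directly to the question's MERGE statement (a sketch only; it assumes the original `UPSERT_STATEMENT` and an open Oracle JDBC connection, and the row layout `{Long key, byte[] data, java.sql.Date lastModified}` is illustrative).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class BatchedMerge {

    /** Number of round trips needed for totalRows at the given batch size (pure helper). */
    static int batchCount(int totalRows, int batchSize) {
        return (totalRows + batchSize - 1) / batchSize;
    }

    /**
     * Executes the MERGE for every row, flushing a batch to the database
     * every 10K rows: one round trip per batch instead of one per row.
     */
    static void mergeBatch(Connection conn, String upsertSql,
                           List<Object[]> rows) throws Exception {
        try (PreparedStatement ps = conn.prepareStatement(upsertSql)) {
            int pending = 0;
            for (Object[] r : rows) {
                ps.setLong(1, (Long) r[0]);
                ps.setBytes(2, (byte[]) r[1]);
                ps.setDate(3, (java.sql.Date) r[2]);
                ps.addBatch();
                if (++pending == 10_000) {
                    ps.executeBatch(); // flush: one round trip for 10K statements
                    pending = 0;
                }
            }
            if (pending > 0) {
                ps.executeBatch(); // flush the final partial batch
            }
        }
    }
}
```

Compared to the temp-table approach, this keeps the original statement unchanged but still executes one MERGE per row on the server side; as the comment above says, you'd have to measure which wins for your data.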