0

I have a few milions of records and I migrate it from one Oracle DB to another. We have performance problem and after discussing with my colleagues I decided to process data multithread.

We have the following artifacts:

  • Spring boot 2.1.6.RELEASE
  • HikariCP 3.2.0
  • Hibernate 5.3.10.Final
  • JDK 11.0.2
  • OJDBC8 12.2.0.1

I have Service class annotated @Service and inside the class is method


    @Async("threadPoolTaskExecutor")
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    public void processMigration(int from, int to) {

        int progressInterval = to - from;
        ProgressBar progressBar = new ProgressBar("Progress " + Thread.currentThread().getName(), progressInterval, ProgressBarStyle.UNICODE_BLOCK);
        try (Stream<SomeEntity> entityStream = someEntityRepository.streamAllInInterval(from, to)) {

            progressBar.start();
            entityStream.forEach(entity -> progressBar.step());
            progressBar.stop();
        } catch (Exception e) {
            progressBar.stop();
            throw e;
        }
    }

inside the foreach there will be some logic for processing data and this service is injected into another class call it Migrator (containing injected ThreadPoolTaskExecutor) with the following method:

public void migrate() { 

crimeService.processMigration(0, 500000);
crimeService.processMigration(500000, 1000000);
}

and my SpringBootApplication class (main configuration):

@SpringBootApplication
@EnableAsync
@EnableTransactionManagement
public class MigrationApplication {

    public static void main(String[] args) {
        SpringApplication.run(MigrationApplication.class, args);
    }

    @Bean("threadPoolTaskExecutor")
    public TaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(10);
        executor.setMaxPoolSize(25);
        executor.setQueueCapacity(30);
        executor.afterPropertiesSet();
        return executor;
    }
}

application.yaml looks like this:

spring:
  jpa:
    properties:
      hibernate:
        jdbc:
          batch_size: 100
          fetch_size: 400
      dialect: org.hibernate.dialect.Oracle12cDialect
      order_inserts: true
      order_updates: true
  datasource:
    url: jdbc:oracle:thin::1521:something
    username: username
    password: password
    hikari:
      maximum-pool-size: 10
      leak-detection-threshold: 30000
    driver-class-name: oracle.jdbc.OracleDriver

I supposed that when I ran the mentioned code then Hikari will create 2 connections, because I have called twice the method processMigration() annotated with @Transactional. I saw in log that it created only at the begining, but when the one of the threads waiting to another then there was only one connection as active. My computer has available 4 cores, so I would expect that with the HW problem does not exist. I understand why there is only one active connection because there is only one thread running, but why the second thread waiting I cannot figure out. Please help and If someone have better approach how to migrate data than I chose, I appriciate your suggestion.

UPDATE

I found out that the JpaTransactionManager creates the transaction for both threads, but one transaction is committed immediately at the begining. When the second thread had finished the task, it has not finished the previous task.

2019-08-15 15:02:39.707 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Creating new transaction with name [org.springframework.data.jpa.repository.support.SimpleJpaRepository.count]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT,readOnly
2019-08-15 15:02:39.707 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Opened new EntityManager [SessionImpl(1773290233<open>)] for JPA transaction
2019-08-15 15:02:39.717 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Exposing JPA transaction as JDBC [org.springframework.orm.jpa.vendor.HibernateJpaDialect$HibernateConnectionHandle@698ef9d1]
2019-08-15 15:02:40.156 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Initiating transaction commit
2019-08-15 15:02:40.157 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Committing JPA transaction on EntityManager [SessionImpl(1773290233<open>)]
2019-08-15 15:02:40.163 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Closing JPA EntityManager [SessionImpl(1773290233<open>)] after transaction
Migrating 1799449 CRIMES

2019-08-15 15:02:40.186 DEBUG 3939 --- [lTaskExecutor-2] o.s.orm.jpa.JpaTransactionManager        : Creating new transaction with name [com.aliter.mvsmigration.dvs.service.CrimeServiceImpl.processMigration]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT
2019-08-15 15:02:40.186 DEBUG 3939 --- [lTaskExecutor-1] o.s.orm.jpa.JpaTransactionManager        : Creating new transaction with name [com.aliter.mvsmigration.dvs.service.CrimeServiceImpl.processMigration]: PROPAGATION_REQUIRES_NEW,ISOLATION_DEFAULT
TASK duration: 830ms
2019-08-15 15:02:40.187 DEBUG 3939 --- [lTaskExecutor-2] o.s.orm.jpa.JpaTransactionManager        : Opened new EntityManager [SessionImpl(346995150<open>)] for JPA transaction
2019-08-15 15:02:40.187 DEBUG 3939 --- [lTaskExecutor-1] o.s.orm.jpa.JpaTransactionManager        : Opened new EntityManager [SessionImpl(1848267920<open>)] for JPA transaction
2019-08-15 15:02:40.187 DEBUG 3939 --- [lTaskExecutor-2] o.s.orm.jpa.JpaTransactionManager        : Exposing JPA transaction as JDBC [org.springframework.orm.jpa.vendor.HibernateJpaDialect$HibernateConnectionHandle@1a69552b]
2019-08-15 15:02:40.188 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Creating new transaction with name [org.springframework.data.jpa.repository.support.SimpleJpaRepository.save]: PROPAGATION_REQUIRED,ISOLATION_DEFAULT
2019-08-15 15:02:40.189 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Opened new EntityManager [SessionImpl(282270616<open>)] for JPA transaction
2019-08-15 15:02:40.191 DEBUG 3939 --- [lTaskExecutor-1] o.s.orm.jpa.JpaTransactionManager        : Exposing JPA transaction as JDBC [org.springframework.orm.jpa.vendor.HibernateJpaDialect$HibernateConnectionHandle@22c0dffa]
2019-08-15 15:02:40.192 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Exposing JPA transaction as JDBC [org.springframework.orm.jpa.vendor.HibernateJpaDialect$HibernateConnectionHandle@817e0aa]
2019-08-15 15:02:40.212 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Initiating transaction commit
2019-08-15 15:02:40.214 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Committing JPA transaction on EntityManager [SessionImpl(282270616<open>)]
2019-08-15 15:02:40.229 DEBUG 3939 --- [           main] o.s.orm.jpa.JpaTransactionManager        : Closing JPA EntityManager [SessionImpl(282270616<open>)] after transaction
PROGRAM duration: 1507ms
mvs-migration-shell:>2019-08-15 15:02:40.389  WARN 3939 --- [lTaskExecutor-2] org.jline                                : Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
2019-08-15 15:02:40.389  WARN 3939 --- [lTaskExecutor-1] org.jline                                : Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
Progress MVSThreadPoolTaskExecutor-2   0% │          │    0/1000 (0:00:00 / ?)
2019-08-15 15:02:41.820 DEBUG 3939 --- [lTaskExecutor-2] o.s.orm.jpa.JpaTransactionManager        : Initiating transaction commit
2019-08-15 15:02:41.821 DEBUG 3939 --- [lTaskExecutor-2] o.s.orm.jpa.JpaTransactionManager        : Committing JPA transaction on EntityManager [SessionImpl(346995150<open>)]
2019-08-15 15:02:41.825 DEBUG 3939 --- [lTaskExecutor-2] o.s.orm.jpa.JpaTransactionManager        : Closing JPA EntityManager [SessionImpl(346995150<open>)] after transaction
Progress MVSThreadPoolTaskExecutor-1 100% │██████████│ 1000/1000 (0:00:08 / 0:
2019-08-15 15:02:49.084 DEBUG 3939 --- [lTaskExecutor-1] o.s.orm.jpa.JpaTransactionManager        : Initiating transaction commit
2019-08-15 15:02:49.085 DEBUG 3939 --- [lTaskExecutor-1] o.s.orm.jpa.JpaTransactionManager        : Committing JPA transaction on EntityManager [SessionImpl(1848267920<open>)]
2019-08-15 15:02:49.585 DEBUG 3939 --- [lTaskExecutor-1] o.s.orm.jpa.JpaTransactionManager        : Closing JPA EntityManager [SessionImpl(1848267920<open>)] after transaction
2019-08-15 15:03:02.981 DEBUG 3939 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool        : HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2019-08-15 15:03:32.984 DEBUG 3939 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool        : HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2019-08-15 15:04:02.989 DEBUG 3939 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool        : HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)
2019-08-15 15:04:32.995 DEBUG 3939 --- [l-1 housekeeper] com.zaxxer.hikari.pool.HikariPool        : HikariPool-1 - Pool stats (total=10, active=0, idle=10, waiting=0)

Peter S.
  • 470
  • 1
  • 7
  • 17
  • I don't understand why you create a Runnable. just call the method directly – Simon Martinelli Aug 15 '19 at 07:04
  • I would say that what you said is probably missunderstanding concept of parallel processing with Spring, please check this [link](https://stackoverflow.com/questions/852743/any-good-spring-threading-with-a-taskexecutor-examples) – Peter S. Aug 15 '19 at 08:35
  • No I think the misunderstanding is on your side: https://www.baeldung.com/spring-async – Simon Martinelli Aug 15 '19 at 08:41
  • I took a look at your link and you are right it is useless to create Runnable and put it again into the task executor, it is ensured with annotations. Unfortunately the problem staying, the one thread waiting to another thread for completing task. :( but thank you – Peter S. Aug 15 '19 at 09:18
  • What is MvsMigrationApplication? Why don't you start MigrationApplication? – JB Nizet Aug 15 '19 at 09:33
  • MvsMigrationApplication is main spring boot configuration class. I run the application through that class. – Peter S. Aug 15 '19 at 09:36
  • Then why are all the annotations and the bean declaration not in that class rather than in MigrationApplication? – JB Nizet Aug 15 '19 at 09:40
  • Aaah, sorry, firstly I did not understand what you mean, but this is my mistake, when I created this topic, because my main configuration class is named as MvsMigrationApplication in my source code, but i changed it to MigrationApplication for better readability, I will fix it, Thanks ! – Peter S. Aug 15 '19 at 09:47
  • you need to call initialize() method of the executor. – incrediblehulk Aug 15 '19 at 10:15
  • Yes, during my own investigation I found out that in the afterPropertiesSet() is called initialize() method, so I add it into my code but problem is still here. :( – Peter S. Aug 15 '19 at 10:23
  • Can you debug the application and check what the class in injected as `crimeService` does it have proper decoration for ASync, it should be some generated wrapper class around your own class. – Martin van Wingerden Aug 15 '19 at 12:10
  • There are some CglibAopProxies and inside are advisors **AsyncAnnotationAdvisor** and **BeanFactoryTransactionAttributeSourceAdvisor** – Peter S. Aug 15 '19 at 12:31
  • Had your team considered using Spring Batch or an ETL tool – eHayik Aug 15 '19 at 13:59
  • I did some research and found out that Spring Batch exists. From my point of view we have not much time to rework the project. It is already implemented approximately "90%" (yes, hard to say because big performance problem occured), so I would like to use this approach, which I see as less invasive. – Peter S. Aug 15 '19 at 15:28

1 Answers1

0

I found out where the problem is, in the mentioned method processMigration() there is the method streamAllInInterval(from, to), which returns the data in specified interval. It contains native SQL query:

Select f.* from someView f left join someTable em on em.id = f.utvar where em.id is null AND ROWNUM BETWEEN :from AND :to

This query returns data when I set from 0 and to 500000 but not for 500000 and 1000000. So I rewrote the query with the first mentioned answer in here SQL ROWNUM how to return rows between a specific range and now everything is working.

Peter S.
  • 470
  • 1
  • 7
  • 17