
I have created a Spring Boot application that takes job requests and runs them in the background. These job requests are so intensive that they would take 4-5 hours if processed by a single thread. Internally, each job request consists of around 300-400 smaller tasks, so I created a task executor to process them in parallel. It worked like a charm and finished everything in 35 minutes. But the problem came when another job ran in parallel with this one: the same job now takes 2 hours. Initially I thought one job might be taking all the threads and making the other job wait, so to solve this I created a second executor and assigned a separate executor to each job. But there was no improvement.

By the way, the internal tasks make calls to the database.

Below is the configuration of the task executors and how I am using them on my methods.

    @Bean(name = "taskExecutor")
    public Executor threadPoolTaskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(100);
        executor.setMaxPoolSize(200);
        executor.setQueueCapacity(200);
        executor.setThreadNamePrefix("Thread1-");
        executor.initialize();
        return executor;
    }

    @Bean(name = "exTaskExecutor")
    public Executor exThreadPoolTaskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(20);
        executor.setMaxPoolSize(30);
        executor.setQueueCapacity(100);
        executor.setThreadNamePrefix("Thread2-");
        executor.initialize();
        return executor;
    }

    @Async("taskExecutor")
    public void job1() {
        //do something  
    }

    @Async("exTaskExecutor")
    public void job2() {
        //do something  
    }
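
Roughly, each job fans its internal tasks out to its executor and waits for all of them, along these lines (a simplified sketch, not my exact code; the `Runnable` list stands in for the real task logic):

    import java.util.List;
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.Executor;
    import java.util.stream.Collectors;

    // Sketch: submit every internal task to the job's executor and block
    // until all of them have completed
    public void runJob(List<Runnable> tasks, Executor executor) {
        List<CompletableFuture<Void>> futures = tasks.stream()
                .map(task -> CompletableFuture.runAsync(task, executor))
                .collect(Collectors.toList());
        CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
    }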

    # database connection pool settings
    spring.datasource.hikari.connectionTimeout=60000
    spring.datasource.hikari.idleTimeout=600000
    spring.datasource.hikari.maxLifetime=1800000
    spring.datasource.hikari.autoCommit=true
    spring.datasource.hikari.maximumPoolSize=120
    spring.datasource.hikari.connectionTestQuery=SELECT 1 FROM DUAL
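
For reference, the same pool settings expressed through HikariCP's Java API (just a translation of the properties above; the JDBC URL and credentials are omitted):

    import com.zaxxer.hikari.HikariConfig;
    import com.zaxxer.hikari.HikariDataSource;

    @Bean
    public HikariDataSource dataSource() {
        // Same settings as the properties above
        HikariConfig config = new HikariConfig();
        config.setConnectionTimeout(60000);      // 60 seconds
        config.setIdleTimeout(600000);           // 10 minutes
        config.setMaxLifetime(1800000);          // 30 minutes
        config.setAutoCommit(true);
        config.setMaximumPoolSize(120);
        config.setConnectionTestQuery("SELECT 1 FROM DUAL");
        return new HikariDataSource(config);
    }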

I cannot figure out where the problem is. Is it in the task executors or in HikariCP? All I can see from the logs is that threads from the two executors are never running in parallel at any point in time. Any help or alternative approach is highly appreciated.

karepu

2 Answers


The main problem here is the number of processors you have available: you can only run n threads in parallel (n == availableProcessors), and the remaining threads run concurrently, time-sliced on the same cores. For example, you can check the available processors using the Runtime class:

Runtime.getRuntime().availableProcessors() // In my case 8

I have a 4-core Hyper-Threading processor, where each core can process two threads in parallel and the remaining threads run concurrently. You can look up the difference between parallel and concurrent, and you can find more information here: Java threads and number of cores
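
For example, you could derive the pool sizes from the machine's processor count instead of hard-coding 100 (a rough sketch; the exact numbers are starting points to tune, not fixed rules):

    import java.util.concurrent.Executor;
    import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

    @Bean(name = "taskExecutor")
    public Executor threadPoolTaskExecutor() {
        // Size the pool from the hardware instead of hard-coding 100 threads
        int cores = Runtime.getRuntime().availableProcessors();
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(cores);       // 8 in my case
        executor.setMaxPoolSize(cores * 2);    // small headroom for I/O-bound tasks
        executor.setQueueCapacity(500);        // queue the rest instead of adding threads
        executor.initialize();
        return executor;
    }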

Ryuzaki L
  • Thanks for the reply, Deadpool. I have gone through all these documents before. I forgot to mention that we previously ran these internal jobs on plain threads (without executors), with a Thread.sleep of a few seconds, and connected to the Oracle database without a connection pool; that completed both jobs in 1.5 hours. But with executors, as I said, it is taking a lot longer. I don't want to go back to the plain-thread mechanism because it is old-fashioned and takes a lot of resources. Also, the metrics I have given came from the same machine, which is a Windows server with 2 cores. – karepu Nov 15 '19 at 07:09
  • Before, without `executors`, you were creating threads only as they were needed; now you are creating hundreds of threads up front and reusing them. When you create a thread pool, the thread scheduler assigns equal resources (CPU) to all the threads when they are created. I would keep these numbers as low as possible @karepu – Ryuzaki L Nov 15 '19 at 16:34

As @Deadpool said, you need to understand where your process is constrained. You may need to scale up (i.e., a machine with more CPUs) or scale out across multiple machines. Scaling out will probably require something like Akka, Zookeeper, Docker Swarm, Kubernetes, or some other scalable work manager.

MarkOfHall
  • This system will be used by 3 people at most, and they submit 2 jobs at any point in time. That's why we don't want to scale up resources. Also, I gave a detailed explanation of my benchmark in a comment on @Deadpool's answer. But thanks for the answer. – karepu Nov 15 '19 at 07:13