My Spring Batch application consumes too many resources (more than 4 GB of RAM).

When I inspect the JVM, I see that the application creates 10 threads.

  • I use a partitioner to process the input file by file, without a scheduler
  • A JobExecutionListener is used to stop the batch at the end of execution

    @Bean
    public Job mainJob() throws IOException {
        SimpleJobBuilder mainJob = this.jobBuilderFactory.get("mainJob")
                .start(previousStep())
                .next(partitionStep())
                .next(finalStep())
                .listener(jobExecutionListener(taskExecutor()));
        return mainJob.build();
    }
    
    @Bean
    public Step partitionStep() throws IOException {
        Step mainStep = stepBuilderFactory.get("mainStep")
                .<InOut, InOut>chunk(1)
                .reader(ResourceReader())
                .processor(processor())
                .writer(itemWriter())
                .build();
    
        return this.stepBuilderFactory.get("partitionStep")
                .partitioner(mainStep)
                .partitioner("mainStep", partitioner())
                .build();
    }
    
    @Bean(name = "taskExecutor")
    public ThreadPoolTaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor taskExecutor = new ThreadPoolTaskExecutor();
        taskExecutor.setCorePoolSize(1);
        taskExecutor.setMaxPoolSize(1);
        taskExecutor.setQueueCapacity(1);
        taskExecutor.setThreadNamePrefix("MyBatch-");
        taskExecutor.initialize();
    
        return taskExecutor;
    }
    
    // This JobExecutionListener stops the batch
    @Bean
    public JobExecutionListener jobExecutionListener(
            @Qualifier("taskExecutor") ThreadPoolTaskExecutor executor) {
        return new JobExecutionListener() {
            private ThreadPoolTaskExecutor taskExecutor = executor;
            @Override
            public void beforeJob(JobExecution jobExecution) {
            }
    
            @Override
            public void afterJob(JobExecution jobExecution) {
                taskExecutor.shutdown();
                System.exit(0);
            }
        };
    }
    
    @Bean
    public Partitioner partitioner() {
        MultiResourcePartitioner partitioner = new MultiResourcePartitioner();
        ResourcePatternResolver patternResolver = new PathMatchingResourcePatternResolver();
    
        try {
            partitioner.setResources(patternResolver.getResources(
                    FILE + configProperties.getIn() + "/*.xml"));
        } catch (IOException e) {
            throw new RuntimeException("I/O problems when resolving the input file pattern.",e);
        }
        partitioner.setKeyName("file");
        return partitioner;
    }
    

How can I make my application run single-threaded? The task executor doesn't seem to have any effect.

Natacha

1 Answer

Your app creates 10 threads but those are not necessarily Spring Batch threads. According to your config, only one thread with prefix MyBatch- should be created.
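To check which of those threads actually belong to the batch, one option is to list the live threads by name. The following is a minimal sketch using only the JDK; the class name `ThreadLister` and the method `threadNamesWithPrefix` are hypothetical helpers, not part of Spring Batch.

```java
import java.util.List;
import java.util.stream.Collectors;

public class ThreadLister {

    // Returns the sorted names of all live threads whose name starts with the given prefix.
    public static List<String> threadNamesWithPrefix(String prefix) {
        return Thread.getAllStackTraces().keySet().stream()
                .map(Thread::getName)
                .filter(name -> name.startsWith(prefix))
                .sorted()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Print every live thread so batch threads can be told apart
        // from JVM and framework threads (GC, JIT compiler, connection pools, ...).
        Thread.getAllStackTraces().keySet().stream()
                .map(Thread::getName)
                .sorted()
                .forEach(System.out::println);

        // "MyBatch-" is the prefix configured on the task executor in the question.
        System.out.println("MyBatch- threads: " + threadNamesWithPrefix("MyBatch-"));
    }
}
```

Called from inside the running batch (for example from a listener), this would show how many of the threads carry the `MyBatch-` prefix.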

Moreover, you declared a task executor as a bean but you did not set it on the partitioned step. Your partitionStep should be something like:

    @Bean
    public Step partitionStep() throws IOException {
        Step mainStep = stepBuilderFactory.get("mainStep")
                .<InOut, InOut>chunk(1)
                .reader(ResourceReader())
                .processor(processor())
                .writer(itemWriter())
                .build();

        return this.stepBuilderFactory.get("partitionStep")
                .step(mainStep) // instead of .partitioner(mainStep)
                .partitioner("mainStep", partitioner())
                .taskExecutor(taskExecutor())
                .build();
    }

> How can I make my application run single-threaded? The task executor doesn't seem to have any effect.

After setting the task executor on the partitioned step, you should see this step being executed by the sole thread as defined in your ThreadPoolTaskExecutor. However, I don't see the benefit of using a single thread for a partitioned step, because the usual goal for such a setup is to process partitions in parallel (either locally with multiple threads or remotely with multiple worker JVMs).
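The behaviour of a pool with a single thread can be illustrated with the plain JDK `ThreadPoolExecutor`, which Spring's `ThreadPoolTaskExecutor` wraps. This is only a sketch: the class `SingleThreadPoolDemo` and its methods are hypothetical, and it does not involve Spring Batch itself. It also shows one thing to watch for: with core size 1, max size 1 and a queue capacity of 1, as in the question's configuration, the pool can hold only one running and one queued task, so further submissions may be rejected.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class SingleThreadPoolDemo {

    // Runs taskCount tasks on a pool with core = max = 1 (unbounded queue)
    // and returns the distinct names of the threads that executed them.
    public static Set<String> workerThreadNames(int taskCount) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        Set<String> names = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> names.add(Thread.currentThread().getName()));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return names;
    }

    // Submits taskCount blocking tasks to a pool with core = max = 1 and a bounded queue,
    // and counts how many submissions are rejected while the first task is still running.
    public static int rejectedCount(int taskCount, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>(queueCapacity));
        CountDownLatch gate = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < taskCount; i++) {
            try {
                pool.execute(() -> {
                    try {
                        gate.await(); // keep the single worker busy until all tasks are submitted
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                rejected++;
            }
        }
        gate.countDown();
        pool.shutdown();
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("worker threads: " + workerThreadNames(5));
        System.out.println("rejected with queue capacity 1: " + rejectedCount(5, 1));
    }
}
```

If the number of partitions (input files) exceeds what the queue can hold, a larger queue capacity on the task executor is probably needed for a single-threaded setup.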

As a side note, it's good that you shut down the task executor with a job listener in afterJob, but don't call System.exit there. You need to let the JVM shut down gracefully.
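The reason System.exit is not needed is that the JVM exits on its own once no non-daemon threads remain, and shutting down the pool ends its worker thread. A minimal JDK-only sketch of that idea follows; `GracefulShutdownDemo` and `shutdownAndWait` are hypothetical names, and a plain `ExecutorService` stands in for the `ThreadPoolTaskExecutor`.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class GracefulShutdownDemo {

    // Stops accepting new tasks, lets already submitted tasks finish,
    // and returns true if the pool terminated within the timeout.
    public static boolean shutdownAndWait(ExecutorService pool, long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        pool.execute(() ->
                System.out.println("work done on " + Thread.currentThread().getName()));

        boolean terminated = shutdownAndWait(pool, 10);
        System.out.println("terminated = " + terminated);
        // No System.exit here: once the pool's worker thread has ended,
        // main returns and the JVM exits by itself.
    }
}
```

If the process still hangs after the executor is shut down, some other non-daemon thread is likely keeping it alive, and that is worth tracking down rather than masking with System.exit.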

Hope this helps.

Mahmoud Ben Hassine
  • Thank you Mahmoud. I forgot the .listener(jobExecutionListener(taskExecutor())) call in the main job that passes in the ThreadPoolTaskExecutor (it was in my code), so I edited the question. I tested your method but it doesn't work: the batch still uses 4 GB of RAM. I see my thread (MyBatch-) in the JVM, but it runs too fast to be single-threaded. I'm using System.exit to close the window after the run. – Natacha Mar 06 '19 at 14:49
  • My answer does not address memory consumption (and it won't improve it). Your question was about threading. With my answer, your partitioned step should use 1 thread (the one from the thread pool). For memory consumption, you need to specify more details about your app in the question: how much data it loads in memory, how many steps are run in parallel, etc. – Mahmoud Ben Hassine Mar 06 '19 at 14:56