I have implemented Spring Batch local step partitioning on Windows with a grid size of 15 and corePoolSize and maxPoolSize both set to 10. When I execute it, 10 threads run in parallel. With 1 million records it completed in 50 seconds on a machine with 8 GB of RAM.

I wanted to run it with more data, so we executed the same jar on Linux with 10 million records and the same configuration (grid size 15, pool size 10). There it started with only one thread, then after some time it started two more threads, and so on. The Linux machine is a server-class box with more than 100 GB of RAM, yet the 10 million records took about 16 minutes to complete, which feels very slow to me. Based on my configuration, 10 threads should run in parallel, so I am confused.
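For reference, the expectation of 10 parallel threads can be checked outside Spring. Spring's `ThreadPoolTaskExecutor` delegates to a `java.util.concurrent.ThreadPoolExecutor`, so the following is a minimal sketch, assuming the default unbounded queue and the default 60 second keep-alive, of a pool with the same core and max size receiving 15 jobs at once. The class name `PoolConcurrencyDemo` and the method `peakConcurrency` are hypothetical and not part of the job; the code avoids lambdas so it should also compile on Java 6.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of the original job.
class PoolConcurrencyDemo {

    // Submits `tasks` jobs to a pool of `poolSize` threads and returns the
    // highest number of jobs that were running at the same moment.
    static int peakConcurrency(int poolSize, int tasks) {
        // Same shape as the XML bean: core == max, unbounded queue,
        // idle core threads allowed to time out (assumed 60 s keep-alive).
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 60L, TimeUnit.SECONDS,
                new LinkedBlockingQueue<Runnable>());
        pool.allowCoreThreadTimeOut(true);

        final AtomicInteger running = new AtomicInteger();
        final AtomicInteger peak = new AtomicInteger();
        // Opens once the first wave of jobs has started.
        final CountDownLatch firstWave =
                new CountDownLatch(Math.min(poolSize, tasks));
        final CountDownLatch done = new CountDownLatch(tasks);

        for (int i = 0; i < tasks; i++) {
            pool.execute(new Runnable() {
                public void run() {
                    // Record how many jobs are running right now.
                    int now = running.incrementAndGet();
                    int seen = peak.get();
                    while (now > seen && !peak.compareAndSet(seen, now)) {
                        seen = peak.get();
                    }
                    try {
                        // Hold the first wave until every pool thread is busy.
                        firstWave.countDown();
                        firstWave.await();
                        Thread.sleep(50); // stand-in for partition work
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        running.decrementAndGet();
                        done.countDown();
                    }
                }
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent jobs: " + peakConcurrency(10, 15));
    }
}
```

A pool like this creates one thread per submitted job until the core size is reached, so 15 jobs submitted together keep all 10 threads busy and the remaining 5 wait in the queue. If the same check reports 10 on the Linux server, the pool itself is probably not what limits the thread count, and it may be worth looking at how many partitions are actually submitted and when.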

The XML configuration is:

<batch:step id="step6">
    <batch:partition step="loadFlatFiles" partitioner="multiFileResourcePartitioner">
        <batch:handler grid-size="15" task-executor="loadCustomerTaskExecutor" />
    </batch:partition>
</batch:step>

<bean id="loadCustomerTaskExecutor" class="org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor">
    <property name="corePoolSize" value="10" />
    <property name="maxPoolSize" value="10" />
    <property name="allowCoreThreadTimeOut" value="true" />
</bean>

<batch:step id="loadFlatFiles">
    <batch:tasklet>
        <batch:chunk reader="masterFileItemReader" writer="masterFileWriter" processor="itemProcessor" commit-interval="5000" skip-limit="1000000">
            <batch:skippable-exception-classes>
                <batch:include class="org.springframework.batch.item.file.FlatFileParseException"/>
            </batch:skippable-exception-classes>
            <batch:listeners>
                <batch:listener ref="recordSkipListener"/>
            </batch:listeners>
        </batch:chunk>
    </batch:tasklet>
</batch:step>

<bean id="recordSkipListener" class="com.cdi.batch.listener.RecordSkipListener" scope="step">
</bean>

<bean id="multiFileResourcePartitioner" class="com.cdi.batch.partitioner.MultiFileResourcePartitioner"
    scope="step">
    <property name="keyName" value="fileResource" />
    <property name="fileName" value="fileName" />
    <property name="directory" value="file:${input.files.location}" />
</bean>
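The source of `com.cdi.batch.partitioner.MultiFileResourcePartitioner` is not shown here, so its behavior is unknown. As a rough sketch only, assuming it works like Spring Batch's built-in `MultiResourcePartitioner` (one partition per input file, with the grid size ignored), the number of partitions would follow the number of files in the directory rather than the grid size. The class `PerFilePartitionSketch`, its `partition` method, and the file names below are hypothetical; plain maps stand in for Spring's `ExecutionContext`.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a per-file partitioner; not the author's class.
class PerFilePartitionSketch {

    // Builds one partition per file. `gridSize` is accepted but unused,
    // which is the assumption this sketch illustrates.
    static Map<String, Map<String, String>> partition(
            int gridSize, String keyName, List<String> files) {
        Map<String, Map<String, String>> partitions =
                new LinkedHashMap<String, Map<String, String>>();
        int index = 0;
        for (String file : files) {
            // Each partition carries the file it should read under keyName.
            Map<String, String> context = new LinkedHashMap<String, String>();
            context.put(keyName, file);
            partitions.put("partition" + index, context);
            index++;
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Three made-up input files with the grid size from the config above.
        List<String> files = Arrays.asList("a.csv", "b.csv", "c.csv");
        Map<String, Map<String, String>> result =
                partition(15, "fileResource", files);
        System.out.println("partitions created: " + result.size());
    }
}
```

Under that assumption, a directory holding fewer than 10 files on the Linux server would keep fewer than 10 threads busy no matter how the pool is sized, so comparing the file count on both machines is one thing that could be checked.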

Has anyone faced the same issue? I would like to know why it behaves like this.

Update: I am storing the job metadata in memory:

<!-- stored job-meta in memory -->
<bean id="jobRepository"
    class="org.springframework.batch.core.repository.support.MapJobRepositoryFactoryBean">
    <property name="transactionManager" ref="transactionManager" />
</bean>

The code was implemented using Java 6 and Spring Batch 3.

  • It seems to be a problem with the Linux JVM or some configuration. What is the version of the JVM used to run the batch on your server? – ElPysCampeador Feb 18 '15 at 10:53
  • This link should be useful to understand why threads are few and slow to start: http://stackoverflow.com/questions/7726871/java-virtual-machine-maximum-number-of-threads – ElPysCampeador Feb 18 '15 at 11:00
  • Thanks for the link. I will look into this. BTW, I am using JDK 6 on Linux. – Shankar Feb 19 '15 at 09:03
  • While googling I found that on Linux a max thread limit can be set at the OS level. You could try to find out the value of this limit on your server and ask for more. – ElPysCampeador Feb 20 '15 at 10:53

0 Answers