I'm working in Spring Batch project. For a batch execution, we got unexpected results in database after execution. In this batch, we just fetch records from database Table1 (Reader), convert some values from a t(Processor) and insert into database Table2 (Writer). We are facing issue as fetching records and inserting records are not matched, and we find differences.
Below are the XML configuration of commit-interval and Page Size.
XML Configuration:
<batch:step id="step2">
<batch:tasklet>
<batch:listeners>
<batch:listener ref="errorListener"></batch:listener>
<batch:listener ref="noDataFoundListener"></batch:listener>
</batch:listeners>
<batch:chunk reader="reader" processor="Batch1processor" writer="writer"
commit-interval="100000"></batch:chunk>
</batch:tasklet>
</batch:step>
<bean id="Batch1reader"
class="org.springframework.batch.item.database.JpaPagingItemReader"
scope="step">
<property name="entityManagerFactory" ref="entityManagerFactory"></property>
<property name="queryProvider" ref="readerQuery"></property>
<property name="pageSize" value="100000"/>
</bean>
<bean id="readerQuery"
class="org.springframework.batch.item.database.orm.JpaNativeQueryProvider"
scope="step">
<property name="entityClass"value="com.project.abd.Batch1InputBean"></property>
<property name="sqlQuery"
value=
" ~SQL_QUERY which returns values around 2,5 Lakhs ">
</property>
</bean>
<bean id="Batch1processor" class="com.project.abd.processor.Batch1processor" scope="step"/>
Processor:
public class Batch1processor implements ItemProcessor<Batch1InputBean, Batch1OutputEntity>
{
@Override
public Batch1OutputEntity process(Batch1InputBean inputBean) throws Exception
{
..........
return Batch1OutputEntity object;
}
}
Here what problem we facing is
If we kept Commit-Interval and Page-Size as 1 million, then 250k records will process in single transaction. Batch performance also improved but afraid that a heap space issue could occur in future.
If we kept Commit-Interval and Page-Size as 100k, then 250k records will process in three transaction (100k record per transaction). Performance improved but
- a. Most of the time batch executed and output result is right and match our expectation.
- But few times, wrong data is stored into database ie) between transactions there is some malfunction occurred and records inserted in different manner.
If we kept Commit-Interval and Page-Size as 10K, then we face issues on most of the execution. reader fetching records are not matched with writer inserting data.
How to sort out this issue? is there any possibility to implement dynamic Commit_interval / Page-Size to avoid this issue.
Edit : Added query
SELECT rownum AS num, A.*,B.MANUFACTURE_WEEK_NO,B.WEEK_PRODUCTION_YEAR,B.WEEK_START_DATE,
C.VALUE24, NVL((D.VALUE25), '') as VALUE25,table7.VALUE25
FROM
(SELECT
TABLE1.VALUE1,TABLE1.VALUE2,TABLE1.VALUE3,TABLE1.VALUE4,
TABLE1.VALUE5,TABLE1.VALUE6,TABLE1.VALUE7,TABLE3.VALUE8,
TABLE4.VAlUE9,TABLE4.VAlUE10,TABLE2.VALUE11,TABLE2.MANUFACTURE_ORDER_MNTH ,
TABLE2.VALUE13,TABLE2.VALUE14,TABLE2.VALUE15,TABLE2.VALUE16,
TABLE2.VALUE17,TABLE2.VALUE18,TABLE2.VALUE19,
TABLE2.VALUE20,TABLE2.VALUE21,TABLE2.VALUE22,TABLE2.VALUE23, 1 AS VALUE_TYPE
FROM TABLE1,TABLE2,TABLE3,TABLE4
WHERE TABLE2.VALUE11 = '100'
AND TABLE4.VALUE14= TABLE2.VALUE14
AND TABLE3.OEI_BUYER_ID= TABLE4.OEI_BUYER_ID
AND TABLE1.OEI_SPEC_ID= TABLE3.OEI_SPEC_ID
AND TABLE1.VALUE2= '00'
AND TABLE2.MANUFACTURE_ORDER_MNTH = 'NOV-22'
AND ((TABLE2.VALUE13>= TABLE2.MANUFACTURE_ORDER_MNTH )
OR (TABLE2.VALUE13= TABLE2.MANUFACTURE_ORDER_MNTH
AND trim(TABLE1.VALUE7) LIKE '%D'))
AND NVL(TABLE2.VALUE23, ' ') NOT IN '-'
) A
INNER JOIN
(SELECT VALUE13,MANUFACTURE_WEEK_NO,WEEK_START_DATE,WEEK_PRODUCTION_YEAR,ROw_number() OVER (partition BY VALUE13 ORDER BY WEEK_START_DATE DESC) AS RNO
FROM WEEK_MASTER_TABLE WHERE WEEK_MASTER_TABLE.VALUE11 = '100'
) B ON A.VALUE13=B.VALUE13 AND B.RNO=1
LEFT JOIN TABLE5 D
ON D.VALUE11 ='100' AND D.VALUE14= A.VALUE14 AND
D.MANUFACTURE_ORDER_MNTH =A.MANUFACTURE_ORDER_MNTH AND D.VALUE13= A.VALUE13 AND D.MANUFACTURE_WEEK_NO= B.MANUFACTURE_WEEK_NO
LEFT JOIN
(SELECT VALUE14,NVL(MAX(table6.VALUE24), '') AS VALUE24 FROM TABLE6 table6
WHERE table6.VALUE11 = '100' GROUP BY VALUE14 ) C ON C.VALUE14 = A.VALUE14
LEFT JOIN TABLE7 table7
ON table7.VALUE11 ='100' AND table7.VALUE14 = A.VALUE14
AND table7.MANUFACTURE_ORDER_MNTH <= A.MANUFACTURE_ORDER_MNTH AND table7.VALUE13 = A.VALUE13