Question:

I'm experiencing an issue with the RepositoryItemReader in my Spring Batch job where it is skipping items during the read process. I have configured the RepositoryItemReader with a specific paging size and aligned it with the chunk size of the step. However, some items are being skipped, and I'm unable to determine the root cause of the problem.

I read the answer here before writing this question, but I couldn't find a solution for my case.

Here are the relevant details of my setup:

I'm using Spring Batch v5.

The RepositoryItemReader is reading data from a repository using the findByIsEnabled method.

I have set the pageSize of the RepositoryItemReader to the same value as the chunk size in the step definition. I have verified that there are no filters or conditions applied to the RepositoryItemReader that could exclude items.

The repository used by the RepositoryItemReader is transactional, and the methods called by the reader are executed within the same transaction. I have checked the data source for any inconsistencies or missing items that could cause skipping, but everything seems to be in order.

Nearly half of the pages are skipped (4961/1000 items ratio).

Update: I have created a simple demo application to reproduce the issue. You can find the code on my GitHub repository: GitHub Repository Link

The example at the top works with the H2 embedded database; you can just run it and see the result.

I suspect there might be an issue with the pagination logic or some misconfiguration that causes the reader to skip the next page. I have reviewed the Spring Batch documentation and tried various configurations, but I couldn't identify the root cause of this behavior.

Could someone please help me identify potential causes for this skipping behavior with the RepositoryItemReader? Any suggestions or insights would be greatly appreciated.

omar

1 Answer

At first glance, the problem in your code is that the ItemProcessor modifies the enabled column, which the reader also uses as its filter. This can't work with a paging reader. Each page query is evaluated against the rows that still match the filter, so every chunk gets a different set of rows because of the change made in the processor, and the page offset jumps over rows that have not been processed yet.
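To make the effect concrete, here is a minimal plain-Java simulation of the skipping. It does not use Spring Batch or a real repository: the `Row` class, `findEnabledPage` and `run` are hypothetical stand-ins, and the in-memory list only imitates a filtered, offset-based page query such as findByIsEnabled with a page request. The row count and page size are arbitrary illustration values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class PagingSkipDemo {

    // Hypothetical stand-in for a table row with an "enabled" flag.
    static final class Row {
        final int id;
        boolean enabled = true;
        Row(int id) { this.id = id; }
    }

    // Imitates a paged query filtered on enabled = true:
    // the filter is applied first, then offset (page * pageSize) and limit.
    static List<Row> findEnabledPage(List<Row> table, int page, int pageSize) {
        return table.stream()
                .filter(r -> r.enabled)
                .skip((long) page * pageSize)
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    // Reads page after page and returns how many rows were actually processed.
    static int run(int totalRows, int pageSize, boolean disableInProcessor) {
        List<Row> table = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) {
            table.add(new Row(i));
        }
        int processed = 0;
        for (int page = 0; ; page++) {
            List<Row> chunk = findEnabledPage(table, page, pageSize);
            if (chunk.isEmpty()) {
                break; // the reader sees an empty page and stops
            }
            for (Row r : chunk) {
                if (disableInProcessor) {
                    r.enabled = false; // the "processor" changes the filter column
                }
                processed++;
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(1000, 100, true));  // prints 500
        System.out.println(run(1000, 100, false)); // prints 1000
    }
}
```

In this simulation, each processed page removes its rows from the filtered result while the offset keeps advancing, so every second block of rows is never read.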

Solution

Don't change the enabled column in the processor. Use a different column, such as processed, and after the batch has completed run an SQL statement to update the enabled column.
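The idea can be sketched with a small plain-Java simulation. This is an illustration under stated assumptions, not Spring Batch code: the `Row` class, the `processed` field, `findEnabledPage` and `run` are hypothetical names, and the final loop stands in for a bulk update along the lines of `UPDATE ... SET enabled = false WHERE processed = true` executed after the chunk-oriented step.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class ProcessedFlagDemo {

    // Hypothetical row: "enabled" is the reader's filter, "processed" is a separate marker.
    static final class Row {
        boolean enabled = true;
        boolean processed = false;
    }

    // Imitates a paged query filtered on enabled = true (filter, then offset and limit).
    static List<Row> findEnabledPage(List<Row> table, int page, int pageSize) {
        return table.stream()
                .filter(r -> r.enabled)
                .skip((long) page * pageSize)
                .limit(pageSize)
                .collect(Collectors.toList());
    }

    // Returns {rows processed, rows still enabled after the final update}.
    static int[] run(int totalRows, int pageSize) {
        List<Row> table = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) {
            table.add(new Row());
        }

        // Chunk-oriented part: only the marker column is touched,
        // so the filtered result set stays stable across pages.
        int processedCount = 0;
        for (int page = 0; ; page++) {
            List<Row> chunk = findEnabledPage(table, page, pageSize);
            if (chunk.isEmpty()) {
                break;
            }
            for (Row r : chunk) {
                r.processed = true;
                processedCount++;
            }
        }

        // Final part, run once after all chunks: flip the filter column in bulk.
        int stillEnabled = 0;
        for (Row r : table) {
            if (r.processed) {
                r.enabled = false;
            }
            if (r.enabled) {
                stillEnabled++;
            }
        }
        return new int[] {processedCount, stillEnabled};
    }

    public static void main(String[] args) {
        int[] result = run(1000, 100);
        System.out.println(result[0] + " processed, " + result[1] + " still enabled");
    }
}
```

Because the filter column is untouched while pages are being read, the offsets line up with the same result set on every query, and the bulk update happens only once nothing is being paged anymore.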

Mar-Z
  • Hello sir, I tried your idea and pushed a commit with your proposal to the repo [**GitHub Repository Link**](https://github.com/omarabdennour/batch.git). Here is the result: https://i.stack.imgur.com/sdkAu.png. It looks good, but something is missing. Thanks for the help, I appreciate it. – omar Jun 24 '23 at 11:34
  • Your observation about the enabled column being modified in the ItemProcessor and causing inconsistencies with the paging reader was spot on. After carefully considering your suggestion, I realized that modifying the value used for filtering in the repository within the ItemProcessor is indeed causing the issue. – omar Jun 24 '23 at 12:31
  • Cool. Thanks for feedback. – Mar-Z Jun 24 '23 at 12:33
  • Based on your guidance, I made the necessary adjustments to ensure that the enabled column is not modified in the ItemProcessor. Instead, I moved the modification logic to a separate step in my Spring Batch job. This has resolved the problem and provided consistent results during the read process. I sincerely appreciate your expertise and guidance in identifying the root cause of the issue. Your assistance has been invaluable in helping me improve the reliability and performance of my Spring Batch job. – omar Jun 24 '23 at 12:34