I saw this question: "Spring Batch not processing all records".

I have the same setup, in which I read records:

Example with page size = 2 and 6 rows in the DB:

A IS_UPDATED=true
B IS_UPDATED=true
C IS_UPDATED=true
D IS_UPDATED=true
E IS_UPDATED=true
F IS_UPDATED=true

The problem is probably mixing pagination and updating the reader query's criteria (IS_UPDATED).

The first read (page 1) returns rows A and B.

After the writer executes (setting IS_UPDATED to false for A and B), the rows still matching IS_UPDATED=true in the DB are:

C IS_UPDATED=true
D IS_UPDATED=true
E IS_UPDATED=true
F IS_UPDATED=true

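The second read moves to page 2, so it returns rows E and F instead of C and D: after the update, C and D have shifted into page 1 of the filtered result and are never read.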
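
To make the skipped rows concrete, here is a minimal plain-Java sketch of the scenario above. It does not use Spring Batch; `PagingSkipDemo`, `readPage` and `run` are hypothetical names, and the table is modeled as an in-memory map. It assumes the reader re-runs the filtered query with an offset on every read, and it numbers pages from 0 rather than 1.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of the scenario; no Spring Batch classes are involved.
class PagingSkipDemo {

    // Re-runs the "query" (IS_UPDATED = true) and returns the requested page,
    // the way an offset-based paging reader does on every read.
    static List<String> readPage(Map<String, Boolean> table, int page, int pageSize) {
        List<String> matching = new ArrayList<>();
        for (Map.Entry<String, Boolean> row : table.entrySet()) {
            if (row.getValue()) {
                matching.add(row.getKey());
            }
        }
        int from = Math.min(page * pageSize, matching.size());
        int to = Math.min(from + pageSize, matching.size());
        return new ArrayList<>(matching.subList(from, to));
    }

    // Reads page after page (0-based index); after each page the "writer"
    // clears the flag, which changes what the next query returns.
    static List<String> run(int pageSize) {
        Map<String, Boolean> table = new LinkedHashMap<>();
        for (String id : List.of("A", "B", "C", "D", "E", "F")) {
            table.put(id, true);
        }
        List<String> processed = new ArrayList<>();
        for (int page = 0; ; page++) {
            List<String> items = readPage(table, page, pageSize);
            if (items.isEmpty()) {
                break;
            }
            processed.addAll(items);
            for (String id : items) {
                table.put(id, false); // writer: IS_UPDATED = false
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Rows C and D are missing from the printed list.
        System.out.println(run(2));
    }
}
```

With a page size of 2, `run(2)` processes only A, B, E and F. C and D keep IS_UPDATED=true and are picked up only by a later run.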

I am looking for an alternative approach, since IS_UPDATED = true is both my ItemReader criterion and my identifier for unprocessed records. If I leave IS_UPDATED set to true, the records will be picked up again on every run of my daily batch app.

Addendum: I am using this task executor for multi-threading:

@Bean
public TaskExecutor taskExecutor() {
    return new SimpleAsyncTaskExecutor("taskName");
}
Ooze

1 Answer

You can't use the RepositoryItemReader in this case because you are changing the parameter used as a filter. The pagination won't work.

Simply use JdbcCursorItemReader instead.

Mar-Z
  • Thanks Mar-Z. Is there a way for me to optimize my identifier? The reason I am tagging the records is that I don't want to reprocess the items the second time the batch is triggered. For example, I have 20 records with an IS_UPDATE column in the DB; say 10 records have a null value, while the rest are 'true'. On the first run I only want to process the records whose value is true, leaving the other 10 untouched. Then, before the second run, another app will update the null values to true, so that the next job processes the remaining 10 records. – Ooze Jun 09 '23 at 17:42
  • Hi Mar-Z, since I am also using a multi-threaded implementation (code added to the original question), is JdbcCursorItemReader still safe for this? Or would JdbcPagingItemReader be a better fit? Thanks! – Ooze Jun 11 '23 at 12:30
  • I second what @Mar-Z said. `The problem is probably mixing pagination and updating the reader query's criteria`: yes, this is the problem, and multi-threading makes it worse. Use a thread-safe cursor based reader instead. – Mahmoud Ben Hassine Jun 12 '23 at 06:16
  • Thanks @MahmoudBenHassine. By thread-safe cursor-based reader, do you mean JdbcPagingItemReader? – Ooze Jun 13 '23 at 01:38
  • The `JdbcPagingItemReader` is thread-safe, but the problem of changing items matching the search criteria while the job is running will remain even with the paging reader, see https://stackoverflow.com/questions/26509971/spring-batch-jpapagingitemreader-why-some-rows-are-not-read. The `JdbcCursorItemReader` is a good choice for your use case. – Mahmoud Ben Hassine Jun 13 '23 at 05:59
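
The idea of a "thread-safe cursor based reader" from the comments can be sketched in plain Java, as a model rather than Spring Batch code. `CursorReadDemo` and `CursorLikeReader` are hypothetical names. The sketch assumes the query runs once when the reader is opened and that later updates do not change that result; how a real database cursor behaves depends on the driver and isolation level. The `synchronized` read stands in for a synchronizing wrapper. As far as I know, `JdbcCursorItemReader` is not thread-safe by itself, so in a multi-threaded step it is usually wrapped in `SynchronizedItemStreamReader`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Plain-Java model only; CursorLikeReader is a hypothetical stand-in,
// not a Spring Batch class.
class CursorReadDemo {

    static class CursorLikeReader {
        private final Iterator<String> cursor;

        // The "query" (IS_UPDATED = true) runs once, when the reader is opened.
        CursorLikeReader(Map<String, Boolean> table) {
            List<String> matching = new ArrayList<>();
            table.forEach((id, flagged) -> {
                if (flagged) {
                    matching.add(id);
                }
            });
            this.cursor = matching.iterator();
        }

        // synchronized: only one thread advances the cursor at a time.
        synchronized String read() {
            return cursor.hasNext() ? cursor.next() : null;
        }
    }

    // Several worker threads share one reader; each clears the flag of the rows it reads.
    static List<String> run(int threads) {
        Map<String, Boolean> table = new ConcurrentSkipListMap<>();
        for (String id : List.of("A", "B", "C", "D", "E", "F")) {
            table.put(id, true);
        }
        CursorLikeReader reader = new CursorLikeReader(table);
        Queue<String> processed = new ConcurrentLinkedQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                String id;
                while ((id = reader.read()) != null) {
                    table.put(id, false); // writer: IS_UPDATED = false
                    processed.add(id);
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        List<String> sorted = new ArrayList<>(processed);
        Collections.sort(sorted); // thread scheduling makes the raw order nondeterministic
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(run(3));
    }
}
```

In this model every row is handed out exactly once, whatever the thread count, because updating the flag never changes which rows the open cursor will return.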