3

I am extending this question here: How does Spring Batch CompositeItemWriter manage transaction for delegate writers?

In my case I have the below CompositeItemWriter, which writes data into multiple tables of the same database. Before writing, it transforms the data by applying various business rules. A single record may satisfy several different business rules, so one writer may receive more data than the others.

@Bean
public CompositeItemWriter<Employee> employeeCompositeWriter() throws Exception {
    // collect the delegate writers, one per target table
    List<ItemWriter<? super Employee>> delegates = new ArrayList<>();
    delegates.add(employeeWriter());
    delegates.add(departmentWriter());
    delegates.add(stockWriter());
    delegates.add(purchaseWriter());

    CompositeItemWriter<Employee> compositeItemWriter = new CompositeItemWriter<>();
    compositeItemWriter.setDelegates(delegates);
    compositeItemWriter.afterPropertiesSet();
    return compositeItemWriter;
}

Scenario - Assume the 1st writer works fine and the 2nd writer throws an exception; the 3rd and 4th writers are then not called. This is the atomic, all-or-nothing behavior Spring Batch defaults to, caused by the transaction rollback.

Here, even if an exception arises in the 2nd writer, I want the 3rd and 4th writers to still be called and save their data, and I also want the data of the 1st and 2nd writers to be saved successfully. Only the exception data should be stored in the Error Table, with the help of a SkipListener, to identify which records were junk or garbage.

Solution - To achieve the above scenario, we added @Transactional(propagation = Propagation.REQUIRES_NEW) to each writer's write method. The 1st writer now saves its data; when the 2nd writer throws an exception (we use namedJdbcTemplate.batchUpdate() to bulk-update the data), we catch and rethrow it. But we see that the commit interval is reduced to 1 (of course, to identify the exact garbage record), and the moment the exception arises in the 2nd writer, the 1st writer is called again and saves duplicate data; the 2nd, 3rd and 4th writers are also called again, but the junk record still does not flow to the 3rd and 4th writers.
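Roughly what one of our writers looks like with this approach (a simplified sketch; the table, SQL, and property names are illustrative, not our real code):

import java.util.List;

import org.springframework.batch.item.ItemWriter;
import org.springframework.dao.DataAccessException;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;
import org.springframework.jdbc.core.namedparam.SqlParameterSourceUtils;
import org.springframework.transaction.annotation.Propagation;
import org.springframework.transaction.annotation.Transactional;

public class DepartmentWriter implements ItemWriter<Employee> {

    private final NamedParameterJdbcTemplate namedJdbcTemplate;

    public DepartmentWriter(NamedParameterJdbcTemplate namedJdbcTemplate) {
        this.namedJdbcTemplate = namedJdbcTemplate;
    }

    // Each delegate writes in its own transaction so that (in theory) a failure
    // in one writer does not roll back the others.
    @Transactional(propagation = Propagation.REQUIRES_NEW)
    @Override
    public void write(List<? extends Employee> items) {
        try {
            // illustrative SQL; :id and :deptName assume matching Employee getters
            namedJdbcTemplate.batchUpdate(
                    "INSERT INTO DEPARTMENT (EMP_ID, DEPT_NAME) VALUES (:id, :deptName)",
                    SqlParameterSourceUtils.createBatch(items.toArray()));
        }
        catch (DataAccessException e) {
            // we catch and rethrow; Spring Batch then reduces the commit
            // interval to 1 and retries the chunk item by item
            throw e;
        }
    }
}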

I don't want the whole batch job to stop if one or a couple of records are garbage, because this job is critical for us to run every time. Is there a way to save all the data for which no exception arises, and save only the exception data into the error table, with the help of a SkipListener if possible, or in any other way?

Is there any way to reuse batch components (the READER or PROCESSOR part) of one step in another step?

Jeff Cook
  • Nothing is skipped, so why would a SkipListener help in this case? Seems like you are better off writing your own single writer that does what you want in a single pass. From a batch point of view there is just a single writer (not multiple) and everything should (generally) be successful or not (that is the nature of writing a batch record). – M. Deinum Aug 07 '20 at 09:35
  • From the viewpoint of the batch, nothing is being skipped. A skip is a fully skipped write, not a partial one as you want. That is not how the SkipListener works. – M. Deinum Aug 07 '20 at 11:13

3 Answers

1

I can't see a way to reconcile Spring Batch's single transaction, which writes the whole chunk atomically, with your idea of keeping atomicity at the level of the individual writers, as long as you also want a SkipListener.

I am not sure if this is possible, but you may be able to test it quickly. This is how a message carries an exception from one processor to the error-handling flow in some integration frameworks like Camel.

  • Your item reader should return an EmployeeWrapper, which contains the employee record and has a field to store an Exception.

  • Your CompositeItemWriter receives List<EmployeeWrapper>, and the composite writer has 5 writers instead of 4. The 5th writer does what your SkipListener would have done:

    List<ItemWriter<? super EmployeeWrapper>> delegates = new ArrayList<>();
    delegates.add(employeeWriter());
    delegates.add(departmentWriter());
    delegates.add(stockWriter());
    delegates.add(purchaseWriter());
    delegates.add(errorRecordWriter());
  • Your first 4 individual writers never throw an exception; instead they mark the item as processed and add the caught exception as an attribute of the EmployeeWrapper.

  • Your 5th errorRecordWriter receives all the records, checks for any record that has the exception attribute set, and writes those to the error table. In case it fails to write an error record, you can throw the exception and all 5 writers will be retried.

  • Regarding how you would know which record is the error record when the batch update fails: it seems that when an error occurs in a chunk, Spring rolls the chunk back and retries it record by record, so it knows which record is problematic. You can do the same thing in your individual writers, i.e. catch the batch-update exception and then retry the records one by one to separate out the error records. A sketch of the whole approach follows below.
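To make this concrete, here is a minimal sketch of the idea. EmployeeWrapper, EmployeeTableWriter, ErrorRecordWriter, and all method names in it are hypothetical, not existing Spring Batch API:

import java.util.Collections;
import java.util.List;

import org.springframework.batch.item.ItemWriter;

// Hypothetical wrapper: carries the record plus any exception a delegate caught.
public class EmployeeWrapper {
    private final Employee employee;
    private Exception error; // set by whichever delegate failed for this record

    public EmployeeWrapper(Employee employee) { this.employee = employee; }
    public Employee getEmployee() { return employee; }
    public Exception getError() { return error; }
    public void setError(Exception error) { this.error = error; }
    public boolean hasError() { return error != null; }
}

// One of the first 4 delegates: never throws; marks the bad record instead.
class EmployeeTableWriter implements ItemWriter<EmployeeWrapper> {

    @Override
    public void write(List<? extends EmployeeWrapper> items) {
        try {
            bulkInsert(items); // one batchUpdate for the whole chunk
        }
        catch (Exception chunkFailure) {
            // the bulk update does not say which record is bad, so retry one by one
            for (EmployeeWrapper wrapper : items) {
                try {
                    bulkInsert(Collections.singletonList(wrapper));
                }
                catch (Exception e) {
                    wrapper.setError(e); // mark instead of throwing, so the
                }                        // remaining delegates still run
            }
        }
    }

    private void bulkInsert(List<? extends EmployeeWrapper> items) {
        // namedJdbcTemplate.batchUpdate(...) for the EMPLOYEE table goes here
    }
}

// The 5th delegate: persists marked records to the error table.
class ErrorRecordWriter implements ItemWriter<EmployeeWrapper> {

    @Override
    public void write(List<? extends EmployeeWrapper> items) throws Exception {
        for (EmployeeWrapper wrapper : items) {
            if (wrapper.hasError()) {
                // if this insert itself fails, let the exception propagate so
                // the chunk (all 5 writers) is retried
                insertIntoErrorTable(wrapper.getEmployee(), wrapper.getError());
            }
        }
    }

    private void insertIntoErrorTable(Employee employee, Exception cause) {
        // JDBC insert into the error table goes here
    }
}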

  • I am using custom writers to batch-update the records in chunks of 2000. JdbcTemplate/NamedParameterJdbcTemplate doesn't tell which is the junk/garbage data unless I let it throw. How will I be able to make use of the error records here? – Jeff Cook Aug 09 '20 at 19:44
  • Hi Kavitha - In our case we had two different writers inserting data into the same table, since the data we were deriving was different. We have now combined these 2 writers, which resulted in smooth transactions working OOTB in Spring Batch. Say STOCK_ID = 1 failed for one writer: then that item is failed for all the other writers, since we are using the SkipListener to skip the records; this way we get consistent behavior. Here we don't need a custom CompositeItemWriter; it works well with the inbuilt API. Thanks for your great help, and please revert! – Jeff Cook Aug 15 '20 at 06:59
1

A couple things here:

  1. Do not use @Transactional with Spring Batch - Spring Batch manages the transactions for you so using that annotation will cause issues. Do not use it.
  2. Manage the exceptions yourself - In the scenario you are describing, where you want to call four ItemWriter implementations for the same item, but want to skip the exceptions at the delegated ItemWriter level, you will need to write your own CompositeItemWriter implementation. Spring Batch provides that level of composition (where we delegate to each ItemWriter implementation with the same item) out of convenience, but from the framework's perspective it is just a single ItemWriter. In order to handle exceptions at the child ItemWriter level, you will need to write your own wrapper and manage the exceptions yourself.

UPDATE:
An example implementation of the custom ItemWriter I'm referring to (note the code below is untested):

import java.util.List;

import org.springframework.batch.item.ItemWriter;

public class MyCompositeItemWriter<T> implements ItemWriter<T> {

    private List<ItemWriter<? super T>> delegates;

    @Override
    public void write(List<? extends T> items) throws Exception {
        // Call every delegate with the same items; a failure in one delegate
        // no longer prevents the remaining delegates from running.
        for (ItemWriter<? super T> delegate : delegates) {
            try {
                delegate.write(items);
            }
            catch (Exception e) {
                // Do logging/error handling here
            }
        }
    }

    // Unlike Spring Batch's CompositeItemWriter, this class extends nothing,
    // so there is no super.setDelegates(..) to call (and no @Override).
    public void setDelegates(List<ItemWriter<? super T>> delegates) {
        this.delegates = delegates;
    }
}
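Wiring it into a step could then look something like this (also untested; the step name, chunk size, and the employeeReader() bean are placeholders, while the writer beans come from the question):

import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.context.annotation.Bean;

@Bean
public Step employeeStep(StepBuilderFactory stepBuilderFactory) throws Exception {
    MyCompositeItemWriter<Employee> writer = new MyCompositeItemWriter<>();
    List<ItemWriter<? super Employee>> delegates = new ArrayList<>();
    delegates.add(employeeWriter());
    delegates.add(departmentWriter());
    delegates.add(stockWriter());
    delegates.add(purchaseWriter());
    writer.setDelegates(delegates);

    return stepBuilderFactory.get("employeeStep")
            .<Employee, Employee>chunk(500) // commit interval stays at chunk level
            .reader(employeeReader())       // placeholder reader bean
            .writer(writer)                 // exceptions handled inside the delegates
            .build();
}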
Michael Minella
  • If I don't put @Transactional(propagation=REQUIRES_NEW), then the 2nd writer never gets called when the 1st writer fails to save the record, so I think it is needed. Yes, I developed a custom CompositeItemWriter where I look at the commit interval to decide the success scenarios and manage the index position of the delegate, so that the custom CompositeItemWriter can start the loop from that delegate if the earlier delegate was successful. – Jeff Cook Aug 11 '20 at 17:45
  • Again, it is not. You need to catch the exception within your delegate. If you use that annotation it will break how transactions work with Spring Batch. – Michael Minella Aug 11 '20 at 18:01
  • I have added all my findings to https://github.com/mminella/scaling-demos/issues/6, could you please verify it once? Please do the needful. – Jeff Cook Aug 11 '20 at 18:06
  • batchUpdate doesn't indicate which records are garbage, and hence I get back all 500 records, which is my chunk size. I don't have a way to identify the garbage records, and even if I don't throw the exception, the other writers are not called. Could you please show some code for how we can do better here? I agree playing with transactions is hugely complex and risky, but I need code to understand more from your comments. Also please check whether this looks good: github.com/mminella/scaling-demos/issues/6 – Jeff Cook Aug 11 '20 at 18:09
  • @mminella - In our case we had two different writers inserting data into the same table, since the data we were deriving was different. We have now combined these 2 writers, which resulted in smooth transactions working OOTB in Spring Batch. Say STOCK_ID = 1 failed for one writer: then that item is failed for all the other writers, since we are using the SkipListener to skip the records; this way we get consistent behavior. Here we don't need a custom CompositeItemWriter; it works well with the inbuilt API. Thanks for your great help, and please revert! – Jeff Cook Aug 15 '20 at 06:58
0

The root cause of the issue was that we were trying to use two different ItemWriters to write data into the same table, and this was causing the transaction to behave weirdly.

We've implemented SkipListeners (considering the fact that we may not often get garbage or junk data, as we perform validation at the initial data load).

Since we've implemented the Spring Batch skip technique in our batch jobs, we can specify certain exception types and a maximum number of skipped items; whenever one of those skippable exceptions is thrown, the batch job doesn't fail but skips that particular item and moves on to the next one. Only when the maximum number of skipped items is reached does the batch job fail. We use the skip logic together with the fault-tolerance features of Spring Batch, which are applied to items in chunk-oriented steps, not to the entire step. A rough configuration sketch follows below.
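In configuration terms, the setup is roughly the following (a sketch only; the exception type, skip limit, chunk size, and the employeeReader()/ErrorTableSkipListener names are placeholders for our actual values):

import org.springframework.batch.core.SkipListener;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.dao.DataIntegrityViolationException;

@Bean
public Step employeeLoadStep(StepBuilderFactory stepBuilderFactory) throws Exception {
    return stepBuilderFactory.get("employeeLoadStep")
            .<Employee, Employee>chunk(500)
            .reader(employeeReader())                    // placeholder reader bean
            .writer(employeeCompositeWriter())           // composite writer from the question
            .faultTolerant()
            .skip(DataIntegrityViolationException.class) // skippable exception type(s)
            .skipLimit(100)                              // job fails once this many items are skipped
            .listener(new ErrorTableSkipListener())
            .build();
}

// Writes each skipped item to the error log table for later reprocessing.
class ErrorTableSkipListener implements SkipListener<Employee, Employee> {

    @Override
    public void onSkipInWrite(Employee item, Throwable cause) {
        // JDBC insert of the item and the cause into the error log table
    }

    @Override
    public void onSkipInRead(Throwable cause) { }

    @Override
    public void onSkipInProcess(Employee item, Throwable cause) { }
}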

Therefore, if an item fails to write at one delegate, it is considered failed for all the other delegates (that item is not passed to the other delegates). We are fine with that, because we capture the details in the error log table and can reprocess them from there as and when needed.

Jeff Cook