Given a Spring Batch job that uses chunk-oriented processing to read data from a file, process it, and write the transformed records to the database, is there a way to skip a record when an exception occurs while writing to the database and proceed with the next record? (Some kind of an interceptor?)

Also, as I understand it, when a record in a given chunk cannot be written (for example, because a value is longer than the database column allows), the entire chunk fails. What I want is to pass this record to some interceptor that can then either try to fix the issue by correcting the value, or write the error record to a log file, and then proceed with the next record rather than failing the batch. I am aware that Spring Batch provides some built-in listeners that get triggered on exceptions, but I am unable to figure out how to use them to do what I want.

How do I go about achieving this requirement in Spring Batch?

Ping
  • Please refer to the following link and see if it helps you. https://www.baeldung.com/spring-batch-skip-logic – Ramu Nov 08 '19 at 06:34

1 Answer

is there a way to skip a record in case there is an exception while writing to the database and proceed with the next record?

Using a fault tolerant step, you can configure a SkipPolicy. Here is an example:

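@Bean
public Step step1() {
    return this.stepBuilderFactory.get("step1")
            .<String, String>chunk(10)           // commit interval: 10 items per transaction
            .reader(flatFileItemReader())
            .writer(itemWriter())
            .faultTolerant()                     // enables skip (and retry) support
            .skipLimit(10)                       // maximum number of skipped items tolerated
            .skip(FlatFileParseException.class)  // exception type that triggers a skip
            .build();
}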

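In this example, whenever a FlatFileParseException is thrown, the corresponding item is skipped, up to the configured limit of 10 skipped items. To skip on write errors, declare the exception type your writer throws instead. You can find more details in the Configuring Skip Logic section of the reference docs.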

What I want is to pass this record to some interceptor that can then try to fix the issue by correcting the value or write the error record to some log file

You can get notified about skipped items by registering a SkipListener. This is the right place to log skipped items to a file for example.
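As a minimal sketch of such a listener, assuming `String` items as in the step above and Spring Batch on the classpath: the class name `LoggingSkipListener` and the use of `java.util.logging` are illustrative choices, not something prescribed by the framework.

```java
import java.util.logging.Logger;

import org.springframework.batch.core.SkipListener;

// Hypothetical listener: records every skipped item so it can be inspected later.
public class LoggingSkipListener implements SkipListener<String, String> {

    private static final Logger LOG = Logger.getLogger(LoggingSkipListener.class.getName());

    @Override
    public void onSkipInRead(Throwable t) {
        // No item is passed here because the read itself failed.
        LOG.warning("Skipped on read: " + t.getMessage());
    }

    @Override
    public void onSkipInProcess(String item, Throwable t) {
        LOG.warning("Skipped on process: " + item + " (" + t.getMessage() + ")");
    }

    @Override
    public void onSkipInWrite(String item, Throwable t) {
        // This is the callback that fires for items the writer could not persist.
        LOG.warning("Skipped on write: " + item + " (" + t.getMessage() + ")");
    }
}
```

The listener would be registered on the fault-tolerant step builder with `.listener(new LoggingSkipListener())`, next to the `.skip(...)` and `.skipLimit(...)` calls.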

and proceed with the next record rather than failing the batch

When a skippable exception is thrown during the write operation, Spring Batch will scan the chunk for the faulty item (because it cannot know which item caused the error). Technically, Spring Batch will set the chunk size to 1 and use one transaction per item, so only the faulty item will be rolled back. This allows you to achieve the requirement above. You can find a code example here.
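To make the scan behaviour concrete, here is a plain-Java simulation of the idea. It does not use any Spring Batch classes and is not how the framework is implemented internally; `ChunkScanSketch`, `writeWithScan`, and the length-checking writer are all hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class ChunkScanSketch {

    /** Outcome of writing one chunk: which items were written and which were skipped. */
    public static final class Result {
        public final List<String> written = new ArrayList<>();
        public final List<String> skipped = new ArrayList<>();
    }

    /**
     * Tries to write the whole chunk in one call. If that fails, falls back to
     * writing the items one by one so that only the faulty items are skipped.
     */
    public static Result writeWithScan(List<String> chunk, Consumer<List<String>> writer) {
        Result result = new Result();
        try {
            writer.accept(chunk);              // normal path: one call for the whole chunk
            result.written.addAll(chunk);
        } catch (RuntimeException wholeChunkFailure) {
            // Scan: treat the failed chunk as rolled back and retry item by item.
            for (String item : chunk) {
                try {
                    writer.accept(List.of(item));
                    result.written.add(item);
                } catch (RuntimeException itemFailure) {
                    result.skipped.add(item);  // only this item is lost
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical writer that rejects values longer than 5 characters.
        Consumer<List<String>> writer = items -> {
            for (String item : items) {
                if (item.length() > 5) {
                    throw new IllegalArgumentException("value too long: " + item);
                }
            }
        };
        Result r = writeWithScan(List.of("abc", "toolongvalue", "xyz"), writer);
        System.out.println("written=" + r.written + " skipped=" + r.skipped);
    }
}
```

With the sample input, the two valid items end up in `written` and only the over-long value ends up in `skipped`, which mirrors the "one transaction per item" behaviour described above.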

Mahmoud Ben Hassine
  • Thanks for the quick response; however, what I am looking for is a way to write the record to the database by fixing the error. For example, let's assume that a record can't be written because a particular field value is larger than the size of the column in the database table. In this case, if the column is not that important (we know the list of columns that are important), I can truncate the value to the size accepted by the DB and move on instead of failing my entire batch for a column that is not so important. I couldn't find a way to intercept the step and handle the error record. – Ping Nov 08 '19 at 10:29
  • In this case, it is not a skip but rather a retry: you are retrying the write of the item (after reprocessing it to truncate it). So you would need to configure a `RetryPolicy`. – Mahmoud Ben Hassine Nov 08 '19 at 12:37
  • Noted. Would a retry policy require saveState=true? Does using a `RetryPolicy` rely on jobs being written in a certain way? In other words, what are the prerequisites for using a `RetryPolicy`? We are using features of Spring Batch in a way that prevents us from using options like saveState=true. – Ping Nov 08 '19 at 13:17
  • The retry policy is applied during the same run of the job; it is not related to the state saved in the job repository. – Mahmoud Ben Hassine Nov 08 '19 at 13:47
  • I am unable to find good examples of how to do this. How do I correct the value of my item using the retry policy? As I understand it, a retry would just rerun the processor and writer? How do I know which item failed, and where do I truncate its value? Can you point me to an example that demonstrates this, as this particular use case is tough to find? – Ping Nov 08 '19 at 15:29
  • Sorry I don't have an example for such a use case. – Mahmoud Ben Hassine Nov 08 '19 at 20:31
  • Fair enough. In the second part of my question, I ask about a way to fix the error record and write it back to the database. I understand from your comments that I can use a `RetryPolicy`. The only question that remains is how to get the problematic item. Unlike the `SkipListener`, a `RetryListener` doesn't provide the item that caused the issue. A complete example is not required, but can you let me know what API I can use to get a reference to the item that caused the error? – Ping Nov 09 '19 at 02:58
  • `A RetryListener doesn't provide the item that caused the issue.` : have you tried to see the content of `RetryContext` ? – Mahmoud Ben Hassine Nov 12 '19 at 10:49
  • Thanks for all the input on all the questions I have posted on SO to date. The reason for not accepting this answer earlier is that we did not end up trying this approach, so I am unsure whether it works. I am accepting the answer for now. – Ping Jun 07 '21 at 16:52
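
The thread ends without an example of the truncate-and-retry idea raised in the comments. The sketch below shows only that logic in plain Java, under stated assumptions: records are modelled as column-to-value maps, the column sizes and the set of important columns are known up front, and the writer is any callback that throws on failure. `TruncateAndRetrySketch` and its methods are hypothetical names. How this would be wired into a Spring Batch step (for instance inside a wrapping writer) was not tried in the thread and is left open here.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

public class TruncateAndRetrySketch {

    /** Returns a copy of the record with non-critical columns cut to their maximum length. */
    public static Map<String, String> truncate(Map<String, String> record,
                                               Map<String, Integer> maxLengths,
                                               Set<String> criticalColumns) {
        Map<String, String> fixed = new LinkedHashMap<>(record);
        for (Map.Entry<String, String> e : record.entrySet()) {
            Integer max = maxLengths.get(e.getKey());
            String value = e.getValue();
            boolean tooLong = max != null && value != null && value.length() > max;
            // Critical columns are never altered; their errors must surface.
            if (tooLong && !criticalColumns.contains(e.getKey())) {
                fixed.put(e.getKey(), value.substring(0, max));
            }
        }
        return fixed;
    }

    /** Writes the record; on failure, truncates it and retries exactly once. */
    public static void writeWithFix(Map<String, String> record,
                                    Map<String, Integer> maxLengths,
                                    Set<String> criticalColumns,
                                    Consumer<Map<String, String>> writer) {
        try {
            writer.accept(record);
        } catch (RuntimeException firstFailure) {
            // A second failure (e.g. a critical column is too long) propagates to the caller.
            writer.accept(truncate(record, maxLengths, criticalColumns));
        }
    }
}
```

Because the record is in hand at the point of failure, this shape sidesteps the question of how to recover the failed item from a `RetryListener`. A record that still fails after truncation raises the exception again, so it can still be handled by the skip logic described in the answer.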