I have a simple CSV file that I am reading in chunks of 1000 and inserting into a database. Before each insert I want to check whether the row already exists in the database and matches the input: if it exists and is equal, ignore it; otherwise insert or update. I am using an ItemProcessor for this check. After implementing it, I realized the JDBC call is too slow (120 ms on average), so I want to batch the IDs before calling the database and compare the results with the input. At this stage the ItemReader was passing items to the ItemProcessor one at a time. Now I am trying to pass 1000 items at once to the ItemProcessor so the JDBC call can be batched. I looked at some examples while trying this but was unable to get the reader to work. This is my sample code:

import java.util.ArrayList;
import java.util.List;

import org.springframework.batch.item.ItemReader;

public class CustomReader<T> implements ItemReader<List<T>> {

    private final ItemReader<T> delegate;
    private final int batchSize;

    public CustomReader(ItemReader<T> delegate, int batchSize) {
        this.delegate = delegate;
        this.batchSize = batchSize;
    }

    @Override
    public List<T> read() throws Exception {
        List<T> records = new ArrayList<>();
        // Accumulate up to batchSize items from the delegate reader
        while (records.size() < batchSize) {
            T record = delegate.read();
            if (record == null) { // delegate is exhausted
                break;
            }
            records.add(record);
        }
        // Returning null signals Spring Batch that the input is exhausted
        return records.isEmpty() ? null : records;
    }
}

This is the config:

@Bean
public Step step() {
    return stepBuilderFactory
            .get("step")
            .<List<String>, List<String>>chunk(1000)
            .reader(reader())
            .processor(processor())
            .writer(writer())
            .build();
}
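
For reference, this is roughly how I wire the delegate reader into the list reader (the file name, PassThroughLineMapper, and bean names are illustrative; my real line mapping differs):

import org.springframework.batch.item.ItemReader;
import org.springframework.batch.item.file.FlatFileItemReader;
import org.springframework.batch.item.file.builder.FlatFileItemReaderBuilder;
import org.springframework.batch.item.file.mapping.PassThroughLineMapper;
import org.springframework.core.io.FileSystemResource;

@Bean
public FlatFileItemReader<String> csvReader() {
    return new FlatFileItemReaderBuilder<String>()
            .name("csvReader")
            .resource(new FileSystemResource("input.csv"))
            .lineMapper(new PassThroughLineMapper()) // each line as a raw String
            .build();
}

@Bean
public ItemReader<List<String>> reader() {
    // Wrap the line-by-line reader so each read() returns up to 1000 lines
    return new CustomReader<>(csvReader(), 1000);
}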

If anyone has a simple sample that passes a list of 1000 CSV rows to the processor, please share it. I checked the example in "Making a item reader to return a list instead single object - Spring batch" but got exceptions and unchecked-call warnings for the processor, writer, etc. with the code above.

I also checked "Spring Batch - Item Reader and ItemProcessor with a list" but got the exceptions listed in its comments, unchecked-type warnings, and so on. Please share a working sample of a reader, processor, and writer that handles multiple rows in one transaction without multi-threading.

tryCatch
• Don't. Use your database for this and write a so-called UPSERT query (UPDATE or INSERT); that way this will all be done in one batched query instead of introducing an additional query per row (which will be slow). – M. Deinum Feb 02 '22 at 14:49
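
For illustration, the upsert the comment suggests might look like the following with a JdbcBatchItemWriter; the person table, the Person class, and PostgreSQL's ON CONFLICT syntax are assumptions here:

import javax.sql.DataSource;

import org.springframework.batch.item.database.JdbcBatchItemWriter;
import org.springframework.batch.item.database.builder.JdbcBatchItemWriterBuilder;

@Bean
public JdbcBatchItemWriter<Person> writer(DataSource dataSource) {
    // Person is a hypothetical POJO with id, firstName and lastName properties
    return new JdbcBatchItemWriterBuilder<Person>()
            .dataSource(dataSource)
            // The writer sends the whole chunk as one JDBC batch;
            // the database decides per row whether to insert or update
            .sql("INSERT INTO person (id, first_name, last_name) " +
                 "VALUES (:id, :firstName, :lastName) " +
                 "ON CONFLICT (id) DO UPDATE " +
                 "SET first_name = EXCLUDED.first_name, last_name = EXCLUDED.last_name")
            .beanMapped()
            .build();
}

With this approach the per-row existence check disappears entirely, since updating a row to identical values is harmless.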

1 Answer

From the question, it appears that you are having a problem reading the List using your custom ItemReader (please correct me if the problem is something else). First of all, could you post the structure of your ItemProcessor and ItemWriter? That would give more insight into the problem.

To shed more light on how Spring Batch works: you read one item (or a list of items) at a time using the reader, but the framework sends one item at a time to the processor (even if the reader prepared a list internally), and the writer receives a list of the processor's outputs (based on the chunk size). If your reader returns a List, the processor will receive that List as a single input, but the writer must then accept a List<List<T>>. In other words, Spring Batch treats the List you read and pass to the processor as one item; with a chunk size of, say, 10, it collects 10 such Lists into an outer list and sends that List<List<T>> to the writer, as in the sketch below.
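
A minimal sketch of what the matching types could look like (illustrative names, assuming Spring Batch 4.x, where the writer receives a list of items):

import java.util.List;

import org.springframework.batch.item.ItemProcessor;
import org.springframework.batch.item.ItemWriter;

@Bean
public ItemProcessor<List<String>, List<String>> processor() {
    return records -> {
        // 'records' is the whole List produced by one read() call,
        // so a single batched JDBC lookup can cover all of them here
        return records;
    };
}

@Bean
public ItemWriter<List<String>> writer() {
    return lists -> {
        // With chunk size n, 'lists' contains n items,
        // each of which is itself a List<String>
        for (List<String> records : lists) {
            // write 'records' to the target
        }
    };
}

Also note that with such a reader, a chunk size of 1000 means 1000 Lists of 1000 rows each in one transaction; if each List is already your intended batch, a chunk size of 1 is probably what you want.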

– Ankit Gautam