Requirement:
- Please consider a Spring Batch application.
- The input is a file containing a single column of values.
- The job follows Spring Batch's chunk-oriented design.
- The chunk size is 1000, i.e. 1000 records are taken per chunk.
- The Reader therefore reads 1000 records from the file, which takes only microseconds.
- The Processor takes one record at a time and runs this SQL query (where :record is bound to the current record's value):
    select * from TABLE where COLUMN2 = :record
  The query may return one row or several, and those rows then go through some business logic.
- In the Writer, we accumulate all the records that passed the business logic (fewer than 1000) and insert them into the database in one batch.
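The flow described above can be modeled in plain Python (illustrative only: the real job uses Spring Batch's ItemReader/ItemProcessor/ItemWriter, and `run_job`, the business rule, and the in-memory table here are hypothetical stand-ins):

```python
CHUNK_SIZE = 1000  # the chunk size from the job configuration

def run_job(input_values, table):
    """Model of one chunk-oriented step: read -> process -> write."""
    written = 0
    for start in range(0, len(input_values), CHUNK_SIZE):
        # Reader: take up to 1000 records from the input file.
        chunk = input_values[start:start + CHUNK_SIZE]
        accumulated = []
        for record in chunk:
            # Processor: stands in for "SELECT * FROM TABLE WHERE COLUMN2 = :record";
            # in the real job this is a database round trip per record.
            rows = table.get(record, [])
            # Placeholder business logic: keep even values only.
            accumulated.extend(r for r in rows if r % 2 == 0)
        # Writer: one batched insert of everything the chunk passed.
        written += len(accumulated)
    return written
```

For example, `run_job(["a", "b", "a"], {"a": [1, 2], "b": [3]})` writes 2 rows, since only the value 2 survives the placeholder rule. The point of the sketch is that the Processor's per-record query is the only part that scales with both the chunk size and the table size.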
The problem: the table holds roughly 400K rows.
- Reading 1000 records from the file takes a few microseconds.
- Processing those 1000 records (i.e. running the SQL query above 1000 times against the database) takes about 4 minutes.
- Writing to the database (inserting, say, the 100 selected records) takes a few microseconds.
While analyzing, I found that only the primary key column is indexed; the column we filter on (COLUMN2) is not indexed.
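To see what an index on COLUMN2 changes, here is a small self-contained sketch using Python's sqlite3 module (the table name T, the index name idx_t_column2, and the generated data are hypothetical, and the production database is presumably not SQLite, but the scan-versus-index-search pattern is the same on any relational database):

```python
import sqlite3

# Hypothetical stand-in for the real table: ~400K rows,
# primary key indexed, COLUMN2 initially not indexed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (ID INTEGER PRIMARY KEY, COLUMN2 TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    ((i, f"val{i % 50_000}") for i in range(400_000)),
)

query = "SELECT * FROM T WHERE COLUMN2 = ?"

# Without an index, every execution is a full scan of all 400K rows,
# and the chunk runs this query 1000 times.
before = conn.execute("EXPLAIN QUERY PLAN " + query, ("val1",)).fetchone()[3]
print(before)  # a plan containing "SCAN" (full table scan)

# With an index on COLUMN2, each lookup becomes an index search.
conn.execute("CREATE INDEX idx_t_column2 ON T (COLUMN2)")
after = conn.execute("EXPLAIN QUERY PLAN " + query, ("val1",)).fetchone()[3]
print(after)  # a plan mentioning "USING INDEX idx_t_column2"
```

Roughly speaking, the per-lookup cost drops from O(n) row scans to an O(log n) index search, which is why 1000 unindexed lookups against 400K rows can add up to minutes. The usual trade-off of an index is slower writes and extra storage, which should be weighed against the read gain here.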
Please advise whether adding an index on COLUMN2 is the right solution here.