
I have 1,000,000 records to insert into a database, all held in one ArrayList. I run a for loop that breaks the ArrayList into smaller lists of 1,000 items and calls saveAll on each one. The first batch of 1,000 inserts in under a second, but every batch after that takes longer and longer; by the time about 100,000 records have been saved, a single batch takes over a minute. If I stop the program and pick up where it left off, the first batch again inserts in under a second, and then the time per batch grows again. Pseudocode is below:

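ArrayList<Student> students = getAllRecords();
int skip = 1000; // batch size

// Resume after the rows already stored. This assumes the table holds exactly the
// first count() records; list indices are 0-based, so the next record is at index count().
for (int i = (int) studentRepository.count(); i < students.size(); i += skip)
{
    // subList returns a view of Student objects, not Strings
    List<Student> subStudents = students.subList(i, Math.min(i + skip, students.size()));
    studentRepository.saveAll(subStudents);
}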
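The splitting step of the loop above can be sketched on its own, without a database. This is only an illustration under stated assumptions: `BatchLoop` and `forEachBatch` are hypothetical names, and the saver callback stands in for `studentRepository.saveAll`. The comment inside `main` mentions flushing and clearing the `EntityManager` after each batch; that addresses one common cause of this kind of slowdown (a persistence context that keeps growing when the whole loop runs in one transaction), which the question does not confirm is happening here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class BatchLoop {

    /**
     * Hands consecutive slices of at most batchSize items to saver,
     * starting at index start. Returns how many items were handed over.
     */
    static <T> int forEachBatch(List<T> items, int start, int batchSize,
                                Consumer<List<T>> saver) {
        if (start < 0) {
            throw new IllegalArgumentException("start must not be negative");
        }
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int processed = 0;
        for (int from = start; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the slice so the saver does not hold a view onto the full list.
            List<T> batch = new ArrayList<>(items.subList(from, to));
            saver.accept(batch);
            processed += batch.size();
        }
        return processed;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            records.add(i);
        }
        // Pretend 1,000 rows are already stored, so resume at index 1000.
        int handled = forEachBatch(records, 1000, 1000, batch -> {
            // Hypothetical JPA version of this callback:
            //   studentRepository.saveAll(batch);
            //   entityManager.flush();
            //   entityManager.clear();
            System.out.println("saving " + batch.size() + " records");
        });
        System.out.println("handled " + handled + " records");
    }
}
```

Keeping the slicing separate from the save call makes it easy to time each batch or to change what happens between batches without touching the index arithmetic.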
  • Batching is done at the JPA level, which is described here: https://stackoverflow.com/questions/10584179/batch-inserts-using-jpa-entitymanager/31020939 – Jens Schauder Jan 17 '20 at 13:36
  • Does this answer your question? [How to do bulk (multi row) inserts with JpaRepository?](https://stackoverflow.com/questions/50772230/how-to-do-bulk-multi-row-inserts-with-jparepository) – Jens Schauder Jan 17 '20 at 13:37
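
As a rough illustration of the JPA-level batching these comments point to: assuming the application uses Spring Boot with Hibernate as the JPA provider (the question does not say so), JDBC insert batching is usually switched on through Hibernate properties like the ones below. The value 1000 simply mirrors the batch size in the question and is not a recommendation.

```properties
# Assumption: Spring Boot application.properties, Hibernate as the JPA provider.
# Group up to 1000 inserts into one JDBC batch.
spring.jpa.properties.hibernate.jdbc.batch_size=1000
# Order inserts by entity type so statements for the same table can be batched together.
spring.jpa.properties.hibernate.order_inserts=true
```

Note that Hibernate does not batch inserts for entities whose id uses `GenerationType.IDENTITY`, so whether these settings help depends on how `Student` generates its id, which the question does not show.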
