I am attempting to improve the performance of my application in which one of the operations is to read data from a CSV file and store the values from each row as one POJO (so 1500 CSV rows = 1500 POJOs) in a PostgresSQL database. It is a spring boot application and uses a JpaRepository with (default configurations) as the means for persistence. My original attempt was basically this statement in each iteration of the loop as it read each row in the CSV file:
autowiredRepoInstance.save(objectInstance);
However with the spring.jpa.show-sql=true
setting in the application.properties
file, I saw that there was one insert being done for each POJO. My attempt at improving the performance was to declare an ArrayList outside the loop, save each instance of the POJO in that list within the loop, and at every 500th item, perform a save, as below (ignoring for now the cases where there are more/less than multiples of 500):
loop(
objList.add(objectInstance);
if (objList.size() == 500) {
autowiredRepoInstance.save(objList);
objList.clear();
}
)
However, this was also generating individual insert statements. What settings can I change to improve performance? Specifically, I would like to minimize the number of SQL statements/operations, and have the underlying Hibernate use "multirow" inserts that postgresql allows:
https://www.postgresql.org/docs/9.6/static/sql-insert.html
But any other suggestions are also welcomed.
Thank you.