
Summary: Inserting 100k rows at a time in 1k batches (yes, it's definitely being batched), and the time to insert increases with each batch. Trying to understand why.

Original Question: JProfiler shows a lot of time spent reading an InputStream in ReadAheadInputStream.fill.

Update: Per the comment below from EJP: "The time is being spent blocked in read(), while the database server is executing the statement and preparing the response."

What could cause so much time to be spent here?

Details:

  • Inserting rows into a single table in 1k batches (a sketch of the configuration and flush pattern is shown after this list).
  • Batches are inside a transaction, which triggers a flush of the Hibernate session.
  • Hibernate and MySQL connection batch properties are enabled, and profiling confirms the SQL INSERT is indeed batched as expected.
  • As the table grows, the time to insert grows (i.e. the first 100k is inserted in 25 seconds, the second 100k in 30, then 35, 40, etc.).
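
For context, here is a minimal sketch of the kind of batch setup and flush pattern described above. It is not our actual code: the property names come from the Hibernate 4.x / MySQL Connector/J documentation, and `BatchInsertSketch`, `MyRow`, and `insertAll` are hypothetical placeholders.

```java
import javax.persistence.Entity;
import javax.persistence.EntityManager;
import javax.persistence.EntityTransaction;
import javax.persistence.Id;
import java.util.List;

public class BatchInsertSketch {

    // Hypothetical entity standing in for the real mapped row class.
    @Entity
    public static class MyRow {
        @Id
        Long id;        // assigned by the application; IDENTITY generation would silently disable Hibernate JDBC batching
        String payload;
    }

    private static final int BATCH_SIZE = 1_000;

    // Assumed Hibernate / Connector/J settings (names from their documentation, not from the question):
    //   hibernate.jdbc.batch_size = 1000
    //   hibernate.order_inserts   = true
    //   JDBC URL parameter: rewriteBatchedStatements=true

    /** Persist all rows in 1k-row batches inside one transaction, flushing and clearing per batch. */
    public void insertAll(EntityManager em, List<MyRow> rows) {
        EntityTransaction tx = em.getTransaction();
        tx.begin();
        for (int i = 0; i < rows.size(); i++) {
            em.persist(rows.get(i));
            if ((i + 1) % BATCH_SIZE == 0) {
                em.flush();  // send the batched INSERT to MySQL
                em.clear();  // detach flushed entities so the persistence context stays small
            }
        }
        em.flush();
        em.clear();
        tx.commit();
    }
}
```

The `em.clear()` after each flush detaches the already-persisted entities so the persistence context does not accumulate every inserted row in memory.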

We have another app (which uses JDBC Template) that executes the same INSERT against the same table but does NOT slow down as the table grows. DB tuning might improve overall performance, but since that app does essentially the same insert without degrading, a sub-optimal DB configuration does not explain the slowdown.

Dev Stack:
- Java 8
- Hibernate 4.3.6
- Spring Data JPA 1.7.2
- MySQL 5.6.20
- MySQL driver 5.1.33
- HikariCP 2.3.2

I understand I'm not providing a code sample. I'm basically looking for ideas and a better understanding of what might cause so much time spent in ReadAheadInputStream.fill and below in the call stack.

[JProfiler screenshot: call tree showing time spent in ReadAheadInputStream.fill]

Justin
  • Your data doesn't show that. It shows that hardly any time is spent in `checkErrorPacket()`. The time is being spent blocked in a read(). Evidently `checkErrorPacket()` is in the normal flow of reading the response, rather than just being invoked when there *is* an error packet in the response. – user207421 Jul 01 '15 at 03:51
  • @EJP, that's a good point. thanks. i'll update the question – Justin Jul 01 '15 at 03:59
  • This is a similar question but in our case we only have 1 open db connection. http://stackoverflow.com/questions/25356260/mysql-jconnector-spends-50-time-in-com-myql-jdbc-utils-readaheadinputstream-fil – Justin Jul 01 '15 at 04:10
  • Another similar question: http://www.experts-exchange.com/Database/MySQL/Q_23333065.html – Justin Jul 01 '15 at 04:12
  • Well the answer is still the same. The time is being spent blocked in `read()`, while the database server is executing the statement and preparing the response. If you want to know why *this* takes as long as it does, that's what you should be asking about. Percentage of execution time really has nothing to do with it. – user207421 Jul 01 '15 at 04:33
  • Your real question is 'why are the insert batch times increasing?' You've turned it into an XY problem. You should edit your title and remove everything about the profiler results from the question. – user207421 Jul 01 '15 at 04:43
  • @EJP, I appreciate your comments. I've updated the question and also asked another, more specific question about the insert statement performance. If the query is the culprit (see the other question) I'll come back and close this question. http://stackoverflow.com/questions/31173928/why-does-performance-of-this-batch-insert-degrade-in-mysql – Justin Jul 02 '15 at 00:53

1 Answer


In the end, we were unable to identify the root cause of the degradation for batch inserts using JPA/Hibernate. For large batch inserts we have therefore reverted to using JDBC Template, and we no longer see any performance degradation.
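
For anyone landing here, a rough sketch of the JdbcTemplate-style batch insert we fell back to. The table, columns, and `MyRow` value object are made up for illustration; the real schema and class names differ.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

import org.springframework.jdbc.core.BatchPreparedStatementSetter;
import org.springframework.jdbc.core.JdbcTemplate;

public class JdbcBatchInsert {

    // Hypothetical value object; the real row class and columns differ.
    public static class MyRow {
        String colA;
        long colB;
    }

    private final JdbcTemplate jdbcTemplate;

    public JdbcBatchInsert(JdbcTemplate jdbcTemplate) {
        this.jdbcTemplate = jdbcTemplate;
    }

    /** Insert one 1k-row chunk as a single JDBC batch. */
    public void insertChunk(final List<MyRow> rows) {
        jdbcTemplate.batchUpdate(
                "INSERT INTO my_table (col_a, col_b) VALUES (?, ?)",   // placeholder table/columns
                new BatchPreparedStatementSetter() {
                    @Override
                    public void setValues(PreparedStatement ps, int i) throws SQLException {
                        MyRow row = rows.get(i);
                        ps.setString(1, row.colA);
                        ps.setLong(2, row.colB);
                    }

                    @Override
                    public int getBatchSize() {
                        return rows.size();
                    }
                });
    }
}
```

`batchUpdate` sends the whole chunk through a single `PreparedStatement`, and with `rewriteBatchedStatements=true` Connector/J can rewrite it into multi-row INSERT statements.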

Justin