1

When INSERTING does the order of columns in the SQL statement make any difference? Particularly with regard to BLOB types?

I'm assuming the answer is NO but the answers I've found so far are focused on SELECTS query performance. Example 1, Example 2

DETAILS:

There has been a refactor of an application using this DB and I'm seeing insert performance degrade as the table grows in rows. I did not see any degradation in insert performance previously.

  • The MySQL db is literally the same between the old app and the new app: same config settings, same table structure. Therefore while DB config optimizations may be applicable here, at the moment I'm trying to understand why there is a difference in INSERT query performance.
  • On the app side it's the same db driver between old and new app
  • There is a difference in the persistence code (JDBC Template vs Hibernate) and one difference I see is the SQL INSERT statements were previously generated with the BLOB column (longblob) as the last column and now the SQL INSERT is generated with BLOB as one of the columns in the middle of the statement.
  • The data being inserted is for the most part the same between the old and new app.
  • Commits are being issued in the same manner and frequency.

Yes, I understand there are other factors at play in this scenario but I figured I'd just confirm INSERT column ordering does not matter when inserting BLOB types.

EDIT 1: By performance degradation I mean the following:

  • The time it takes to insert 100k rows (in 1k per commit) increases from about 25 seconds per 100k to 27, 30, 35 seconds for each new 100k batch. Since this is a test db I am able to clear the data as needed.
  • I did not see this behavior before and still do not see it on HSQL.
  • I'm measuring time to insert 100k on the app side.

Edit 2: I'm using version 5.6.20 so Explain is not available on INSERT. However, I did use show profile as described here. Oddly MySQL reports the duration of the new query as much faster than the old query (which I am struggling to understand)

New query duration: 0.006389
Old query duration: 0.028531

Community
  • 1
  • 1
Justin
  • 6,031
  • 11
  • 48
  • 82
  • In what way are you “seeing insert performance degrade”? Do you have any numbers/statistical data/measurements to back that up? – CBroe Jun 27 '15 at 00:35
  • There could conceivably be a difference if the columns that are now after the BLOB are indexed. – user207421 Jun 27 '15 at 00:47
  • _“I'm measuring time […] on the app side”_ – then it might(!) not reflect how long it actually takes on the database side – so your change of DB driver could actually cause the effect (f.e. if it manages memory differently or sth. like that). Try and measure on the database side. – CBroe Jun 27 '15 at 10:33
  • @EJP: He said, the database structure was the same – so the columns did not actually change order in the database, only in the statement. I doubt this should have any influence on index usage – but to make sure, Justin, could you run `EXPLAIN` on both one of the old-style and the new statements? – CBroe Jun 27 '15 at 10:37
  • @CBroe, see edit regarding EXPLAIN. What is best/simplest way to measure on the DB side? Regarding the original question, if you are certain column order does not make a difference in INSERT performance, feel free to answer and i'll accept and maybe address remaining issue in separate SO question. – Justin Jun 27 '15 at 23:51
  • No, it does not. The way the data is stored is independent of column order, performance is independent of column order. Selecting in the order of the columns as they exist in the table might (MIGHT) give a slight performance gain since the columns to not have to be ordered. This MIGHT will be mighty tiny. – Norbert Jun 27 '15 at 23:53
  • @CBroe Like everybody else, I'm talking about the column order in the statement, not in the database. What the question is about. I don't know why you would think otherwise. I can't conceive of a circumstance in which the order of columns in the database would matter, but I can conceive that sending a large blob before indexable columns could matter. – user207421 Jun 28 '15 at 01:01
  • @EJP: _“but I can conceive that sending a large blob before indexable columns could matter”_ – I _can’t_ see why that should matter. The SQL statement is only the “transport vehicle” – it will be parsed and disassembled by the query parser, and after that the database will run it’s own optimizations in any case, before executing the statement. So it will make its own decisions on how to handle indexes etc. That the order of columns in the original statement would influence that in any significant way, is IMHO highly unlikely. – CBroe Jun 28 '15 at 01:06
  • Plus, that OP now states the database reports the new statement as executing _faster_ than the original one, suggests even more that the method of “measuring” applied here to come to the original conclusion that this was now somehow slower due to the new order of columns in the query does not reflect what it actually going on. – CBroe Jun 28 '15 at 01:07

0 Answers0