
I am attempting to do a bulk insert into MySQL using

INSERT INTO TABLE (a, b, c) VALUES (?, ?, ?), (?, ?, ?)

I have the general log on, and see that this works splendidly for most cases. However, when the table has a BLOB column, it doesn't work as well.

I am trying to insert 20 records.

Without the BLOB, the general log shows all 20 records inserted in a single query.

With the BLOB, the general log shows only 2 records per query; it takes 10 queries in total.

Is this a problem with MySQL, the JDBC driver, or am I missing something else? I would prefer to use a BLOB, as my data is stored as protobufs.

Here is an example table...

CREATE TABLE my_table (
  id CHAR(36) NOT NULL,
  name VARCHAR(256) NOT NULL,
  data BLOB NOT NULL,
  PRIMARY KEY (id)
);

Then, create your batch inserts in code...

val ps = conn.prepareStatement(
  "INSERT INTO my_table(id, name, data) VALUES (?, ?, ?)")
records.grouped(1000).foreach { group =>
  group.foreach { r =>
    ps.setString(1, UUID.randomUUID.toString)
    ps.setString(2, r.name)
    ps.setBlob(3, new MariaDbBlob(r.data))
    ps.addBatch()
  }
  ps.executeBatch()
}

If you run this and inspect the general log, you will see...

"2018-10-12T18:37:55.714825Z 4 Query INSERT INTO my_table(id, name, fqdn, data) VALUES ('b4955537-2450-48c4-9953-e27f3a0fc583', '17-apply-test', _binary ' 17-apply-test\"AAAA(?2Pending8?????,J$b4955537-2450-48c4-9953-e27f3a0fc583

1:2:3:4:5:6:7:8Rsystem'), ('480e470c-6d85-4bbc-b718-21d9e80ac7f7', '18-apply-test', _binary ' 18-apply-test\"AAAA(?2Pending8?????,J$480e470c-6d85-4bbc-b718-21d9e80ac7f7

1:2:3:4:5:6:7:8Rsystem') 2018-10-12T18:37:55.715489Z 4 Query INSERT INTO my_table(id, name, data) VALUES ('7571a651-0e0b-4e78-bff0-1394070735ce', '19-apply-test', _binary ' 19-apply-test\"AAAA(?2Pending8?????,J$7571a651-0e0b-4e78-bff0-1394070735ce

1:2:3:4:5:6:7:8Rsystem'), ('f77ebe28-73d2-4f6b-8fd5-284f0ec2c3f0', '20-apply-test', _binary ' 20-apply-test\"AAAA(?2Pending8?????,J$f77ebe28-73d2-4f6b-8fd5-284f0ec2c3f0

As you can see, each INSERT INTO only has 2 records in it.

Now, if you remove the data field from both the schema and the insert and re-run, you will see the following output (for 10 records)...

"2018-10-12T19:04:24.406567Z 4 Query INSERT INTO my_table(id, name) VALUES ('d323d21e-25ac-40d4-8cff-7ad12f83b8c0', '1-apply-test'), ('f20e37f2-35a4-41e9-8458-de405a44f4d9', '2-apply-test'), ('498f4e96-4bf1-4d69-a6cb-f0e61575ebb4', '3-apply-test'), ('8bf7925d-8f01-494f-8f9f-c5b8c742beae', '4-apply-test'), ('5ea663e7-d9bc-4c9f-a9a2-edbedf3e5415', '5-apply-test'), ('48f535c8-44e6-4f10-9af9-1562081538e5', '6-apply-test'), ('fbf2661f-3a23-4317-ab1f-96978b39fffe', '7-apply-test'), ('3d781e25-3f30-48fd-b22b-91f0db8ba401', '8-apply-test'), ('55ffa950-c941-44dc-a233-ebecfd4413cf', '9-apply-test'), ('6edc6e25-6e70-42b9-8473-6ab68d065d44', '10-apply-test')"

All 10 records are in the same query.
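(As context for the log output above: the driver only combines addBatch() rows into one multi-value INSERT when batch rewriting is enabled on the connection, which one comment below confirms was done here. A minimal sketch of the connection string, assuming the Connector/J rewriteBatchedStatements property; host, port, and database name are placeholders.)

```scala
// Sketch only: without this property the driver sends one INSERT per row,
// so no multi-value rewriting appears in the general log at all.
val url = "jdbc:mariadb://localhost:3306/mydb?rewriteBatchedStatements=true"
// val conn = java.sql.DriverManager.getConnection(url, "user", "pass")
```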

  • Please post a [mcve]. – Mark Rotteveel Oct 12 '18 at 18:33
  • You should try JDBC's batch syntax, which allows you to use a static insert statement which you reuse in an iterative manner. [This thread](https://stackoverflow.com/questions/2993251/jdbc-batch-insert-performance) has more info, also regarding optimization parameters. – Mick Mnemonic Oct 12 '18 at 18:35
  • @MarkRotteveel added detail – Paul Cleary Oct 12 '18 at 19:09
  • @MickMnemonic I posted the batch syntax, I think it is correct. Did rewriteBatchedStatements as well. The latter example demonstrates bulk inserts work as expected when a BLOB is not used – Paul Cleary Oct 12 '18 at 19:10
  • Are you using MySQL or are you using MariaDB? And which JDBC driver (+ version) are you using? – Mark Rotteveel Oct 12 '18 at 19:12
  • @MarkRotteveel I found out that the odd issue was with using `.setBlob` in the `PreparedStatement`. If I used `setBytes` everything worked fine as expected. Verified the same in the general log. – Paul Cleary Oct 14 '18 at 14:45

1 Answer


I tinkered until I found the fix...

val ps = conn.prepareStatement(
  "INSERT INTO my_table(id, name, data) VALUES (?, ?, ?)")
records.grouped(1000).foreach { group =>
  group.foreach { r =>
    ps.setString(1, UUID.randomUUID.toString)
    ps.setString(2, r.name)
    // ps.setBlob(3, new MariaDbBlob(r.data))
    ps.setBytes(3, r.data)
    ps.addBatch()
  }
  ps.executeBatch()
}

Using PreparedStatement.setBytes instead of MariaDbBlob did the trick.
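Since the question mentions the data is protobufs, it's worth noting that setBytes accepts any Array[Byte], however produced; generated protobuf messages expose toByteArray for exactly this. Below is a self-contained sketch with a hypothetical stand-in encoder (a naive length-prefixed format, not real protobuf wire format) just to show the shape of the call site.

```scala
import java.io.ByteArrayOutputStream

// Stand-in for a protobuf message's toByteArray: serialize some fields to
// Array[Byte] using a naive single-byte length prefix (assumes < 128 bytes
// per field). The point is only that setBytes takes the raw array directly,
// keeping the driver's multi-value batch rewriting intact.
def encode(fields: Seq[String]): Array[Byte] = {
  val out = new ByteArrayOutputStream()
  fields.foreach { f =>
    val bytes = f.getBytes("UTF-8")
    out.write(bytes.length)          // length prefix
    out.write(bytes, 0, bytes.length)
  }
  out.toByteArray
}

val payload = encode(Seq("17-apply-test", "Pending"))
// ps.setBytes(3, payload)  // instead of ps.setBlob(3, new MariaDbBlob(payload))
```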