I am attempting to do a bulk insert into MySQL using
INSERT INTO TABLE (a, b, c) VALUES (?, ?, ?), (?, ?, ?)
I have the general log on, and see that this works splendidly for most cases. However, when the table has a BLOB column, it doesn't work as well.
I am trying to insert 20 records.
Without the BLOB, I see all 20 records in the same query in the general log, 20 records inserted in the same query.
WITH the BLOB, I see only 2 records per query in the general log, it takes 10 queries in total.
Is this a problem with MySQL, the JDBC Driver, or am I missing something else. I would prefer to use a BLOB as I have data in protobufs.
Here is an example table...
CREATE TABLE my_table (
id CHAR(36) NOT NULL,
name VARCHAR(256) NOT NULL,
data BLOB NOT NULL,
PRIMARY KEY (id)
);
Then, create your batch inserts in code...
val ps = conn.prepareStatement(
"INSERT INTO my_table(id, name, data) VALUES (?, ?, ?)")
records.grouped(1000).foreach { group =>
group.foreach { r =>
ps.setString(1, UUID.randomUUID.toString)
ps.setString(2, r.name)
ps.setBlob(3, new MariaDbBlob(r.data))
ps.addBatch()
}
ps.executeBatch()
}
If you run this and inspect the general log, you will see...
"2018-10-12T18:37:55.714825Z 4 Query INSERT INTO my_table(id, name, fqdn, data) VALUES ('b4955537-2450-48c4-9953-e27f3a0fc583', '17-apply-test', _binary ' 17-apply-test\"AAAA(?2Pending8?????,J$b4955537-2450-48c4-9953-e27f3a0fc583
1:2:3:4:5:6:7:8Rsystem'), ('480e470c-6d85-4bbc-b718-21d9e80ac7f7', '18-apply-test', _binary ' 18-apply-test\"AAAA(?2Pending8?????,J$480e470c-6d85-4bbc-b718-21d9e80ac7f7
1:2:3:4:5:6:7:8Rsystem') 2018-10-12T18:37:55.715489Z 4 Query INSERT INTO my_table(id, name, data) VALUES ('7571a651-0e0b-4e78-bff0-1394070735ce', '19-apply-test', _binary ' 19-apply-test\"AAAA(?2Pending8?????,J$7571a651-0e0b-4e78-bff0-1394070735ce
1:2:3:4:5:6:7:8Rsystem'), ('f77ebe28-73d2-4f6b-8fd5-284f0ec2c3f0', '20-apply-test', _binary ' 20-apply-test\"AAAA(?2Pending8?????,J$f77ebe28-73d2-4f6b-8fd5-284f0ec2c3f0
As you can see, each INSERT INTO only has 2 records in it.
Now, if you remove the data
field from the schema and insert and re-run, you will see the following output (for 10 records)...
"2018-10-12T19:04:24.406567Z 4 Query INSERT INTO my_table(id, name) VALUES ('d323d21e-25ac-40d4-8cff-7ad12f83b8c0', '1-apply-test'), ('f20e37f2-35a4-41e9-8458-de405a44f4d9', '2-apply-test'), ('498f4e96-4bf1-4d69-a6cb-f0e61575ebb4', '3-apply-test'), ('8bf7925d-8f01-494f-8f9f-c5b8c742beae', '4-apply-test'), ('5ea663e7-d9bc-4c9f-a9a2-edbedf3e5415', '5-apply-test'), ('48f535c8-44e6-4f10-9af9-1562081538e5', '6-apply-test'), ('fbf2661f-3a23-4317-ab1f-96978b39fffe', '7-apply-test'), ('3d781e25-3f30-48fd-b22b-91f0db8ba401', '8-apply-test'), ('55ffa950-c941-44dc-a233-ebecfd4413cf', '9-apply-test'), ('6edc6e25-6e70-42b9-8473-6ab68d065d44', '10-apply-test')"
All 10 records are in the same query