
I've been messing around with transactions in Go's database/sql package, and I'm trying to understand how to do bulk upserts without the per-insert round trip to the database for each row. The examples here don't really show how any bulk queries would be done.

updateMoney, err := db.Prepare("INSERT INTO balance SET money=?, id=? ON DUPLICATE KEY UPDATE money = money + ?")
...
tx, err := db.Begin()
...
res, err := tx.Stmt(updateMoney).Exec(123.45, 1, 123.45)
res, err := tx.Stmt(updateMoney).Exec(67.89, 2, 67.89)
res, err := tx.Stmt(updateMoney).Exec(10.23, 3, 10.23)
...
tx.Commit()

Ideally, I'd be able to take a prepared query and build up a list of upserts to be sent all at once... but here, we get a result back from the database after each execution. Any suggestions on how to go about this?

Edit: My coworker found this open ticket that describes the problem... it looks to be a larger concern than strictly within the context of a transaction.

tbischel

2 Answers


It depends on which driver you are using; some drivers/databases don't support transactions at all.

For example, go-sql-driver/mysql supports transactions just fine.

Your code should work, or you could change it a little to:

tx, err := db.Begin()
...
stmt, err := tx.Prepare(`INSERT INTO balance SET money=?, id=? ON DUPLICATE KEY UPDATE money = money + ?`)
res, err := stmt.Exec(123.45, 1, 123.45)
res, err := stmt.Exec(67.89, 2, 67.89)
res, err := stmt.Exec(10.23, 3, 10.23)
...
tx.Commit()

Also check this answer, which goes into a lot of detail about transactions.

OneOfOne
  • So I think the problem I am trying to verbalize is that I'd like the statement execution to be deferred until the commit, rather than paying the communication overhead of a round trip to the database for each individual statement. It is the bulk execution that seems to be missing. I'll add more above – tbischel Aug 14 '14 at 00:18
  • @tbischel Ah, I see what you mean now. Is there any real reason you want that implemented, or is it just for testing? Because the actual data doesn't get committed until after you call commit. – OneOfOne Aug 14 '14 at 00:32
  • yeah we are writing an aggregation tool that is likely to have thousands of rows per batch to be updated in the transaction... we'd like to limit network overhead associated with the transaction, and also limit the time that the transaction remains open. – tbischel Aug 14 '14 at 03:56
  • Build a slice of query parameters, and then once ready, open a transaction and range over the slice before committing? – elithrar Aug 14 '14 at 10:03
  • we were considering something a bit similar to that... writing a wrapper transaction package that constructs a really long single statement with multiple upserts in it... and provide a deferred execution to the time of the commit. – tbischel Aug 14 '14 at 16:57
  • @tbischel keep in mind that the network overhead is minimal because the connection remains open regardless of what you do; one other option is a "massive" insert statement, but that wouldn't work with your `ON DUPLICATE`. It doesn't open a new connection for each `Exec`. – OneOfOne Aug 14 '14 at 17:01

So while we weren't able to batch separate queries for each upsert, we did manage to write valid MySQL for bulk upserts in a single statement...

INSERT INTO balance (id, money) VALUES (1, 123.45), (2, 67.89), (3, 10.23)
ON DUPLICATE KEY UPDATE money = money + VALUES(money)

So for anyone who wants to do a bulk upsert, this pattern works around the lack of batch execution in the MySQL driver.

tbischel