29

Is there a way I can neatly do an upsert operation in Slick? The following works but is too obscure/verbose and I need to explicitly state the fields that should be updated:

val id = 1
val now = new Timestamp(System.currentTimeMillis)
val q = for { u <- Users if u.id === id } yield u.lastSeen 
q.update(now) match {
  case 0 => Users.insert((id, now, now))
  case _ => Unit
}
Synesso
  • 37,610
  • 35
  • 136
  • 207

2 Answers2

38

Updated for native upsert/merge support in Slick 2.1

Attention

You have to use plain SQL embedding with your database native MERGE statement. All trials to simulate this statement will very likely lead to incorrect results.

Background:

When you simulate the upsert / merge statement, Slick will have to use multiple statements to reach that goal (e.g. fist a select and then either an insert or an update statement). When running multiple statements in a SQL transaction, they usually doe not have the same isolation level as one single statement. With different isolation levels, you will experience strange effects in massive concurrent situations. So everything will work fine during the tests and fail with strange effects in production.

A database usually has a stronger isolation level while running one statement as between two statements in the same transaction. While one running statement will not be affected by other statements that run in parallel. The database will either lock everything the statement touches or it will detect interefence between running statements and automatically restart the problematic statements when necessary. This level of protection does not hold, when the next statement in the same transaction is executed.

So the following scenario may (and will!) happen:

  1. In the first transaction the select statement behind user.firstOption doesn't find a database row for the current user.
  2. A parallel second transaction inserts a row for that user
  3. The first transaction inserts a second row for that user (similar to a phantom read)
  4. You either end with two rows for the same user or the first transaction fails with a constraint violation although its check was valid (when it ran)

To be fair, this will not happen with the isolation level "serializable". But this isolation level is comes with a huge performance hit is rarely used in production. Additionally serializable will need some help from your application: The database management system will usually not really serializable all transaction. But it will detect violations against the serializable requeirement and just abort the transactions in trouble. So your application must be prepared for rerunning transaction that are aborted (randomly) by the DBMS.

If you rely on the constraint violation to occur, design your application in a way that it will automatically rerun the transaction in question without bothering the user. This is similar to the requirement in isolation level "serializable".

Conclusion

Use plain SQL for this scenario or prepare for unpleasant surprises in production. Think twice about possible problems with concurrency.

Update 5.8.2014: Slick 2.1.0 has now native MERGE support

With Slick 2.1.0 there is now native support for the MERGE statement (see the release notes: "Insert-or-update support which makes use of native databases features where possible").

The code will look like this (taken from the Slick test cases):

  def testInsertOrUpdatePlain {
    class T(tag: Tag) extends Table[(Int, String)](tag, "t_merge") {
      def id = column[Int]("id", O.PrimaryKey)
      def name = column[String]("name")
      def * = (id, name)
      def ins = (id, name)
    }
    val ts = TableQuery[T]

    ts.ddl.create

    ts ++= Seq((1, "a"), (2, "b")) // Inserts (1,a) and (2,b)

    assertEquals(1, ts.insertOrUpdate((3, "c"))) // Inserts (3,c)
    assertEquals(1, ts.insertOrUpdate((1, "d"))) // Updates (1,a) to (1,d)

    assertEquals(Seq((1, "d"), (2, "b"), (3, "c")), ts.sortBy(_.id).run)
  }
stefan.schwetschke
  • 8,862
  • 1
  • 26
  • 30
  • 2
    Great answer and thanks for the update based on slick 2.1.0. I almost missed your update because it is at the end of your detailed answer based on slick < 2.1.0. It might be helpful if you add something to the top of the answer pointing people to your update at the end. – drstevens Sep 16 '14 at 21:03
  • 1
    Is there a convenient way (Slick 3.0) to do batch `insertOrUpdate`? That is without Action sequence of single `insertOrUpdate`? – User Jul 30 '15 at 18:58
  • @Ixx Please don't hijack the thread. Better post a separate (followup) question. But of course you can post a link to your question here. – stefan.schwetschke Jul 31 '15 at 08:55
  • The question is called "Upsert in Slick", and it doesn't have a version number, why is it hijacking? – User Jul 31 '15 at 10:06
  • @Ixx Sorry, perhaps I used the wrong wording. There was no accusation intended. It's just from a pragmatic view: You would probably get better answers if you would create a new question. It's just how people usually do it here and usually it works fine. – stefan.schwetschke Aug 01 '15 at 11:41
  • Ok, no problem - I'm aware that you can't edit endlessly this answer for all possible variants and upcoming versions of Slick... I'll open a new question if I need this again. – User Aug 01 '15 at 13:43
  • @Ixx that's not the problem. I would happily update the answer. It's just that I don't know the right answer for your problem. That's why I think that a top level Wilson will give you a better result than a comment in that thread... – stefan.schwetschke Aug 01 '15 at 23:24
  • Note - the insertOrUpdate apparently [only works on MySQL](https://scala-slick.org/doc/3.2.1/supported-databases.html) – Jethro Jan 02 '20 at 12:56
1

Apparently this is not (yet?) in Slick.

You might however try firstOption for something a bit more idiomatic:

val id = 1
val now = new Timestamp(System.currentTimeMillis)
val user = Users.filter(_.id is id)
user.firstOption match {
  case Some((_, created, _)) => user.update((id, created, now))
  case None => Users.insert((id, now, now))
}
OlivierBlanvillain
  • 7,701
  • 4
  • 32
  • 51
  • 7
    This will not do what you think it does. Usually databases do not run on the isolation level "serializable", but in something like "read committed". So while each SQL statement is well isolated, two statements in the same transaction may see different data. So the "select" that is synthesized for "user.firstOption" may not see any matching database row. But a parallel transaction might insert one and even commit. When the insert statement runs, it will insert a second matching row. This is possible with two separate statements. With one (!) upsert/merge statement, this situation cannot occur. – stefan.schwetschke Sep 24 '13 at 14:01
  • I thought that transaciton where atomic (cf http://en.wikipedia.org/wiki/Database_transaction, A database transaction, by definition, must be atomic, consistent, isolated and durable). Can you provide some sources/examples to illustrate? – OlivierBlanvillain Sep 24 '13 at 19:25
  • 8
    Once I thought this too, it's a very common misconception :-) Atomic means, that either everything in the transaction is written or nothing is written. It does not mean that every operation in the transaction sees the same version of the data. What you mean is "isolation", and this is provided in several "flavors", depending which tradeoffs you are willing to make. Look at this Wikipedia article (http://en.wikipedia.org/wiki/Isolation_%28database_systems%29) and read the parts about "phantom reads" and "isolation levels". – stefan.schwetschke Sep 25 '13 at 07:23
  • [Great blog post about the problems with this matter](http://www.depesz.com/2012/06/10/why-is-upsert-so-complicated/) – mgttlinger Sep 11 '14 at 05:18