2

[PSQLException: ERROR: duplicate key value violates unique constraint "dictionary_word_idx" Detail: Key (word)=(odirane) already exists.]

I have unique index preventing any duplications. I wonder how to InsertAll an Array with thousands elements but only the new ones? I'm using Slick 1.0.1 and Postgresql 9.1

Edit: I'm trying the following:

    def run = {
      val source = scala.io.Source.fromFile("/home/user/dev/txt/test1.txt")
      val lines = source.mkString
      source.close()

      val words = lines.split("[^\\p{Ll}]").distinct

      database withTransaction {

        val q = for {
            w <- words.toList
            row <- Dictionary if row.word != w  
        } yield w


        Dictionary.autoInc.insertAll(q: _*)
      }


      words.length
    }

but t dosent compile:

 polymorphic expression cannot be instantiated to expected type; 
 [error]  found   : [G, T]scala.slick.lifted.Query[G,T] 
 [error]  required: scala.collection.GenTraversableOnce[?] [error]          
  row <- Dictionary if row.word != w

Edit 2:

case class Word(id: Option[Long], word:String)

object Dictionary extends Table[Word]("dictionary") {
  def id = column[Long]("id", O.PrimaryKey, O.AutoInc)
  def word = column[String]("word")

  def * = id.? ~ word <> (Word, Word.unapply _)
  def dictionary_word_idx = index("dictionary_word_idx", word, unique = true)
  def autoInc = word returning id
}
Roskoto
  • 1,722
  • 1
  • 16
  • 27

2 Answers2

2

Another alternative is to write raw SQL. Postgres doesn't have a default way to on duplicate ignore, but you can emulate it in a few different ways, shown here https://dba.stackexchange.com/questions/30499/optimal-way-to-ignore-duplicate-inserts

Combine that with http://slick.typesafe.com/doc/1.0.0-RC2/sql.html

Edit:

Here's an example

def insert(c: String) =
    (Q.u + """INSERT INTO dictionary
        (word)
    SELECT""" +?  c + 
    """WHERE
        NOT EXISTS (
            SELECT word FROM dictionary WHERE word = """ +? c + ")"
    ).execute

val words = lines.split("[^\\p{Ll}]")

words.foreach(insert)

Is that what you mean by "at once"? I think that's going to be the most performant way of doing this without being crazy.

If it's too slow for you, there's another suggestion of creating a temporary table without the unique constraint, copy your current table into the temp table, insert the new words into the temp table, and then select distinct out of that table. That's shown here: https://stackoverflow.com/a/4070385/375874

But I think that's WAY overkill. Unless you have some crazy requirements or something.

Community
  • 1
  • 1
Falmarri
  • 47,727
  • 41
  • 151
  • 191
  • Could you please add example how to construct a single INSERT with WHERE NOT EXISTS that inserts (or not if duplicated) at once all new values from a Seq[String] ? – Roskoto Aug 09 '13 at 05:02
1

Conceptually:

def insertAll[T](items: Seq[T]): Seq[Either[(T, Exception), (T, Int)]] = items.map { i =>
  try {
    // Perform an insert supposing returns and int representing the PK on the table
    val pk = …
    Right(i, pk)
  } catch {
    case e: Exception => Left(i, e)
  }
}

You perform each insert operation and then, based on the result, you return a Left or Right object that keep tracks of the end result and give you a detailed context to interpret the operation.

EDIT

Let's suppose that your DAO object looks like:

object Dictionary extends Table[Word]("dictionary") {
   // ...
}

where Word is your object model and moreover you have provided the nuts and bolts (as I can deduce from your pasted code) it should be (where words is a Seq[Word]):

words.map { w => 
  try {
    Right(w, Dictionary.autoInc.insert(w))
  } catch {
    case e: Exception => Left(w, e)
  }
}

What you get is a sequence of Either that encapsulates the outcome for further processing.

Considerations The solution provided by me attempts optimistically to perform the operation against the DB without requiring to pre-filter the list based on the state of the DB. In general pre-filtering is problematic in an heavily multiuser application provided you can't assume that nobody added a word in your pre-filtered list after you've performed the filter. State more simply: uniqueness constraint is a robust feature provided by DBMS which is better to exploit than to reinvent. The solution you edited above is a no-solution because you still need to face possibly PK violation exception.

Lord of the Goo
  • 1,214
  • 15
  • 31