2

I have a table with a text column that I want a uniqueness guarantee. Since I can't put a unique constraint on text column, I want to use transactions to surround my inserts to ensure the uniqueness guarantee.

My question is: what the minimum isolation level is required for this guarantee?

Glide
  • 20,235
  • 26
  • 86
  • 135
  • 1
    Voted to close this question as a duplicate. And anyway, no transaction behavior enforces uniqueness, so you were pursuing the wrong strategy anyway. – Bill Karwin Oct 18 '16 at 19:58
  • 2
    @BillKarwin I don't think this question is a duplicate of the one so marked: the answers to the other questions provide alternate solutions (which, granted, should be used), but nevertheless does not answer the question as asked. To answer the OP, `SERIALIZABLE` is the highest you can go, but still not sufficient to guarantee uniqueness in *all* situations. See http://kejser.org/race-condition-when-creating-unique-values/ – bishop Oct 18 '16 at 19:59
  • @BillKarwin This is not a duplicate as the one you posted. This question is asking about the isolation level needed to make text unique. The other asks about creating a unique constraint on a text column. – Glide Oct 20 '16 at 16:18

1 Answers1

4

There is no usage of transactions that ensures uniqueness.

Use a UNIQUE constraint for that.

But you can't make any index, including a unique index, that is over 768 bytes. Which means you can't put a unique constraint on a long string data type like TEXT, VARCHAR over a certain length, etc.

You can use a prefix index to make a leading portion of a long string unique.

ALTER TABLE MyTable ADD UNIQUE KEY (textfield(255));

Another method of ensuring uniqueness is to search the existing rows for the value you want to insert, and only if it is not found, insert the new value.

But this requires that you have exclusive access to the table, to prevent a race condition with another concurrent session trying to insert the same value.

This also has nothing to do with transaction isolation level. Even the SERIALIZABLE isolation level won't help you avoid the race condition. Read http://dev.mysql.com/doc/refman/5.7/en/innodb-transaction-isolation-levels.html which says:

[SERIALIZABLE] is like REPEATABLE READ, but InnoDB implicitly converts all plain SELECT statements to SELECT ... LOCK IN SHARE MODE if autocommit is disabled.

So if multiple sessions concurrently try to read the table to check for a duplicate value, they are allowed to, because shared locks don't block other shared locks. Then they may all decide it's safe to insert the new value, and you have duplicates.

To get an exclusive lock on reading the table, you'd use SELECT ... FOR UPDATE explicitly. But you can do this in any transaction isolation level.

Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
  • Just making sure I understand: `SELECT unique_text_column FOR UPDATE` can tell me if the unique text definitely exists or not? If so, wouldn't I need to be in a transaction to ensure the `INSERT unique_text into table` statement runs with no other sessions inserting the same unique text? – Glide Oct 20 '16 at 20:24
  • Yes, you need to do the SELECT and the subsequent INSERT in a single transaction, or else the lock created by the SELECT FOR UPDATE will be released. But this has nothing to do with isolation level. – Bill Karwin Oct 20 '16 at 20:46
  • But wouldn't the isolation level determine what the other session sees? That is, session A starts trans, `select unique_text_for update`, `INSERT unique_text into table`. Session B starts transaction and does the same thing. Would setting the isolation level in B's transaction `READ_UNCOMMMITED` vs `Serializable` give different results? That is, `READ_UNCOMMITED` wouldn't see A's insert. – Glide Oct 20 '16 at 21:56
  • Once session A has taken an exclusive lock on a row (or the gap where that row would be), session B *cannot* acquire a lock for `SELECT FOR UPDATE` on the same row, because A has the lock. This works the same regardless of any transaction isolation level in either session. – Bill Karwin Oct 21 '16 at 15:36
  • That makes senses, thanks. I just want to confirm my understand since a lot have been said. If all the sessions use SELECT and a subsequent INSERT in a transaction for inserting into the text column, that would guarantee uniqueness on that column. Is that correct? – Glide Oct 21 '16 at 18:42
  • Only if you acquire an exclusive lock with `SELECT... FOR UPDATE`. If you don't lock on the SELECT, then you still have a race condition, no matter what transaction isolation level you use. – Bill Karwin Oct 21 '16 at 18:59