3

I need to check if a unique (not primary key) field is present in a table, if it isn't - insert it:

ID unqiueidentifier, --PK
name varchar(100),
otherID int --Unique

Perhaps using a conditional insert:

DECLARE @OtherID int
INSERT INTO TableA (OtherID)
SELECT @OtherID
WHERE NOT EXIST (SELECT * from TableA where OtherID = @OtherID)

SELECT MainID from TableA where OtherID = @OtherID

ref: Whilst locked - see if otherID exists, if so return mainID, if not insert otherID & return new mainID

Which kind of lock / hint would I require to ensure a concurrent process doesn't insert the same OtherID in the time between checking and inserting (apparently concurrency is still an issue even in a single statement such as the above).

ref: http://kejser.org/race-condition-when-creating-unique-values/

ref: http://weblogs.sqlteam.com/dang/archive/2007/10/28/Conditional-INSERTUPDATE-Race-Condition.aspx

I have found a lot of information on HOLDLOCK / UPDLOCK. I understand how they could put a hold on SELECTING or UPDATING existing rows. What doesn't make sense is how they can put a lock on INSERTING new rows without basically just locking the entire table - aka TABLOCK.

How can you lock a row that doesn't exist?! Or would UPDLOCK just lock the entire table above because you are dong a SELECT *?

If it does put a lock on the entire table - that seems like it could be very disruptive and bad for scaling?! Perhaps I just enforce a UNIQUE constraint and allow the occasional exception?

I can't see any good options here?!

Community
  • 1
  • 1
niico
  • 11,206
  • 23
  • 78
  • 161
  • Why not just try to insert and handle when a unique key violation is raised? – jean May 05 '17 at 20:27
  • Or using TRANSACTION and COMMIT/ROLLBACK – Jacob H May 05 '17 at 20:28
  • @jean I was thinking about that - though exceptions are expensive apparently? Also, I don't like to design things expecting exceptions - perhaps that instinct is wrong?! – niico May 05 '17 at 20:29
  • 2
    What proportion of the time are you expecting it to fail? See http://stackoverflow.com/questions/3407857/only-inserting-a-row-if-its-not-already-there/3408196#3408196 – Martin Smith May 05 '17 at 20:30
  • @JacobH Transactions would require (UPDLOCK, HOLDLOCK) I think - it's an option. – niico May 05 '17 at 20:31
  • It will probably fail Maybe 1/1000 or less - so I don't want to slow every operation down for a relatively rare occurrence – niico May 05 '17 at 20:31
  • 1
    Regarding holdlock it locks the **range** assuming a suitable index exists between the keys either side of the prospective insert. – Martin Smith May 05 '17 at 20:33
  • @MartinSmith I don't really follow what you mean - can you provide an example? thx. – niico May 05 '17 at 20:44
  • 1
    @niico Good example and explanation here: [Key-Range Locking - MSDN](https://technet.microsoft.com/en-us/library/ms191272(v=sql.105).aspx) – SqlZim May 05 '17 at 20:46
  • 1
    Also see http://michaeljswart.com/2010/04/transaction-phenomena-part-4-serializable-vs-snapshot/ – Martin Smith May 05 '17 at 21:22
  • Thanks - plenty of reading. (gotta love these drive by down votes?!). – niico May 05 '17 at 21:23
  • Just take care with **HINTS** because the engine is not obligated to follow them (they are just hints not commands) and can escalate up the lock causing hard-to-find-the-cause nasty problems. Yes exceptions are demanding and you can take in mind how much duplicate inserts can happen. With a 1/1000000 ration the exception will not be a problem, see @MartinSmith comments – jean May 06 '17 at 12:15

2 Answers2

3

You have two options here.

A pessimistic approach (expecting the worst) would be to use serializable isolation level: SET TRANSACTION ISOLATION LEVEL SERIALIZABLE, which would prevent concurrent transactions from inserting conflicting data (phantom rows) while you are looking for and possibly inserting your unique value. This has the cost of significantly reducing concurrency, as other transactions will have to wait while you finish both your lookup and insert.

An optimistic approach (hoping for the best) would be to simply insert your ID and catch and handle the exception if it occurs. This offers higher concurrency but makes your code a bit more complex.

Which method to choose depends on your performance requirements and the probability of conflicts; if you expect conflicts to be rare or need highest possible concurrency, the optimistic approach will be preferable. Otherwise you'll have to suffer the concurrency consequences of the pessimistic approach to ensure data integrity.

niico
  • 11,206
  • 23
  • 78
  • 161
mustaccio
  • 18,234
  • 16
  • 48
  • 57
3

This is a good reference for upserts, but it can be applied to this situation as well: Sam Saffron upsert approach

To aquire a key range lock(so as not to lock the whole table), you will need an index with OtherId as the key, prefereably unique, to use with (updlock, serializable) (equivalent to with (updlock, holdlock)).

declare @OtherId int;

insert into TableA (OtherId)
select @OtherId
where not exists (
  select OtherId 
  from TableA with (updlock, serializable) 
  where OtherId = @OtherId
)

select Mainid 
from TableA 
where OtherId = @OtherId;

Reference:

niico
  • 11,206
  • 23
  • 78
  • 161
SqlZim
  • 37,248
  • 6
  • 41
  • 59
  • thx. I have an index on otherID with a WHERE clause (cause I need to allow multiple nulls) - will this still work?: CREATE UNIQUE NONCLUSTERED INDEX idx_yourcolumn_notnull ON YourTable(yourcolumn) WHERE yourcolumn IS NOT NULL; – niico May 05 '17 at 20:47
  • @niico if the query uses the index, then it should work. – SqlZim May 05 '17 at 20:51
  • Is it preferable to use serializable rather than holdlock? (I know they are identical but seems the former is more fashionable?). If the query doesn't use the index, would it just lock the whole table or just allow duplicates? Seems a little fragile relying on the index? – niico May 05 '17 at 21:00
  • 1
    @niico it would lock the table of not able to acquire a range lock. It is not relying on the index so much as taking advantage of the index. – SqlZim May 05 '17 at 21:08
  • Are you sure I need updlock? I'm not updating anything? – niico May 05 '17 at 21:17
  • @niico the role of `UPDLOCK` Is to prevent deadlock. If both took S locks on the range then neither would be able to insert until one was rolled back. Two sessions can't hold U locks on the same resource so one has to wait. – Martin Smith May 05 '17 at 21:20
  • Why do you need SERILIZABLE if you have UPDLOCK?! – niico May 05 '17 at 21:37
  • 1
    @niico `serializable` is the foundation that makes the key range lock work (RangeS-S lock), and keeps them untli the end of the transaction. We are using `updlock` to acquire update locks instead of shared locks. Please read [this article that explains it all step by step](https://samsaffron.com/blog/archive/2007/04/04/14.aspx) – SqlZim May 08 '17 at 15:41