2

I'm working with a large SQL Server database that's built upon the idea of a counter table for primary key values. Each table has a row in this counter table with the PK name and the next value to be used as a primary key (for that table). Our current method of getting a counter value is something like this:

BEGIN TRAN
-- counter table layout assumed: one row per table, (CounterName, CounterValue)
UPDATE Counters SET CounterValue = CounterValue + 1 WHERE CounterName = 'MyTable'
SELECT CounterValue FROM Counters WHERE CounterName = 'MyTable'
COMMIT TRAN

That works mostly well since the process of starting a transaction, then updating the row, locks the row/page/table (the level of locking isn't too important for this topic) until the transaction is committed.

The problem here is that if a transaction is held open for a long period of time, access to that table/page/row is locked for too long. We have situations where hundreds of inserts may occur in a single transaction (which needs access to this counter table).

One attempt to address this problem would be to always use a separate connection from your application that never holds a transaction open. The counter update and read would then be quick, so the table is generally available. The problem here is that triggers may also need access to these counter values, which makes that a fairly unreasonable rule to have. In other words, we have triggers that also need counter values, and those triggers sometimes run in the context of a larger parent transaction.

Another attempt to solve the problem is to use a SQL Server application lock (sp_getapplock) to serialize access to the table/row. That's OK most of the time too, but it has downsides. One of the biggest downsides here also involves triggers: since triggers run in the context of the triggering query, the app lock would be held until any parent transaction completes.
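For reference, that pattern looks roughly like this (the counter table, column and resource names below are just placeholders, not our actual schema):

BEGIN TRAN

-- Serialize on a named resource rather than on the counter row itself.
-- With @LockOwner = 'Transaction' the lock is released only when the
-- outermost transaction ends, so inside a trigger it is held until the
-- parent transaction completes - the downside described above.
EXEC sp_getapplock @Resource = 'Counter_MyTable',
                   @LockMode = 'Exclusive',
                   @LockOwner = 'Transaction'

DECLARE @NextKey int

UPDATE Counters
SET @NextKey = CounterValue,
    CounterValue = CounterValue + 1
WHERE CounterName = 'MyTable'

COMMIT TRAN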

So what I'm trying to figure out is a way to serialize access to a row/table that can be used from an application or from an SP/trigger, but that never runs in the context of a parent transaction. If a parent transaction rolls back, I don't need the counter value to roll back. Having always-available, fast access to a counter value is much more important than losing a few counter values should a parent transaction be rolled back.

I should point out that I completely realize that using GUID values or an identity column would solve a lot of my problems, but as I mentioned, we're talking about a massive system, with massive amounts of data that can't be changed in a reasonable time frame without a lot of pain for our clients (we're talking hundreds of tables with hundreds of millions of rows).

Any thoughts about the best way to implement such a counter table would be appreciated. Remember - access should always be available from many apps, services, triggers and other SPs, with very little blocking.

EDIT - we can assume SQL Server 2005+

Ron
  • What version of SQL Server are you using? – Gordon Linoff Feb 05 '14 at 19:31
  • SQL Server 2012 finally supports sequences. I'm pretty sure that is more robust and faster than anything that is built outside of the engine. So if you are not on SQL Server 2012 it might be cheaper to upgrade than to implement and maintain a homegrown solution. –  Feb 05 '14 at 19:33
  • You can still use an identity column without changing your other tables. Something like `CREATE TABLE counter (counter INT IDENTITY(1, 1) PRIMARY KEY );`, and then `INSERT INTO counter DEFAULT VALUES; SELECT @@SCOPE_IDENTITY; DELETE FROM counter WHERE counter = @@SCOPE_IDENTITY;`, perhaps? –  Feb 05 '14 at 19:33
  • @Gordon Linoff - SQL Server 2005+. Updated my OP. – Ron Feb 05 '14 at 19:39
  • @hvd - The only way I could see that changing my situation would be to have such a counter table for each of my data tables. Otherwise, I think I'm right back in that situation of transactions blocking access to it. Unless I'm perhaps misunderstanding your suggestion. Also, @@SCOPE_IDENTITY isn't supported in 2005. – Ron Feb 05 '14 at 19:45
  • @Ron. . . . SQL Server 2012 introduces sequences, which might solve your problem. Alas. – Gordon Linoff Feb 05 '14 at 19:48
  • @Ron Sorry, I meant `SCOPE_IDENTITY()` (it already existed in SQL Server 2000), I got the syntax mixed up with that of `@@IDENTITY`. Too late to edit, unfortunately. I was thinking that `INSERT INTO counter DEFAULT VALUES` shouldn't need a table lock, so two transactions should be able to insert simultaneously. I'm not 100% certain of that, though. –  Feb 05 '14 at 19:50

2 Answers

1

The way the system currently works is unscalable. You have noticed that yourself. Here are some solutions, in rough order of preference:

  1. Use an IDENTITY column (You can set the IDENTITY property without rebuilding the table. Search the web to see how.)
  2. Use a sequence
  3. Use Hi-Lo ID generation (What's the Hi/Lo algorithm?). In short, consumers of IDs (application instances) check out big ranges of IDs (like 100) in a separate transaction. The overhead of that scheme is very low; a rough sketch follows below.
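To make options 2 and 3 concrete, here is a rough sketch. The sequence syntax requires SQL Server 2012+, and all object names below (dbo.MyTableId, dbo.CounterRanges, dbo.GetKeyRange) are made-up illustrations, not part of your schema:

-- Option 2: a sequence increments outside of any user transaction
CREATE SEQUENCE dbo.MyTableId AS bigint START WITH 1 INCREMENT BY 1;
SELECT NEXT VALUE FOR dbo.MyTableId;

-- Option 3 (Hi-Lo): reserve a whole block of IDs in one short call; the caller
-- hands them out from memory and only comes back when the block is exhausted
CREATE TABLE dbo.CounterRanges
(
    CounterName sysname NOT NULL PRIMARY KEY,
    NextValue   bigint  NOT NULL
);
GO
CREATE PROCEDURE dbo.GetKeyRange
    @CounterName sysname,
    @RangeSize   int,
    @RangeStart  bigint OUTPUT
AS
BEGIN
    -- single-statement read-and-increment; run it on its own connection so the
    -- row lock is held only for an instant
    UPDATE dbo.CounterRanges
    SET @RangeStart = NextValue,
        NextValue   = NextValue + @RangeSize
    WHERE CounterName = @CounterName;
END

The caller then uses @RangeStart through @RangeStart + @RangeSize - 1; any IDs left unused when a process shuts down are simply lost, which matches your point that losing a few counter values is acceptable.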

Working with the constraints from your comment below: You can achieve scalable counter generation even with a single transaction and no application-level changes. This is kind of a last resort measure.

Stripe the counter. For each table, you have 100 counters. The counter N tracks IDs that conform to ID % 100 = N. So each counter tracks 1/100th of all IDs.

When you want to take an ID, you take it from a randomly chosen counter. The chance is good that this counter is not in use by a concurrent transaction. You will have little blocking due to row-level locking in SQL Server.

You initialize counter N to N and increment it by 100. This ensures that all counters generate distinct ID ranges.

Counter 0 generates 0, 100, 200, .... Counter 1 generates 1, 101, 201, .... And so on. A disadvantage of this is that your IDs are no longer sequential. In my opinion, an application should not rely on this anyway because it is not a reliable property.

You can abstract all of this into a single procedure call; the code complexity really isn't that much bigger. You basically just generate an additional random number and change the increment logic, as in the sketch below.
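A minimal sketch, assuming a counter table keyed on (CounterName, Stripe) and a procedure named GetNextId (both names are made up for illustration):

CREATE TABLE dbo.StripedCounters
(
    CounterName sysname NOT NULL,
    Stripe      int     NOT NULL,   -- 0 .. 99
    NextValue   bigint  NOT NULL,   -- initialized to Stripe, incremented by 100
    PRIMARY KEY (CounterName, Stripe)
);
GO
CREATE PROCEDURE dbo.GetNextId
    @CounterName sysname,
    @NextId      bigint OUTPUT
AS
BEGIN
    -- pick one of the 100 stripes at random; concurrent callers will usually hit
    -- different rows, so row-level locking keeps blocking to a minimum
    DECLARE @Stripe int;
    SET @Stripe = ABS(CHECKSUM(NEWID())) % 100;

    UPDATE dbo.StripedCounters
    SET @NextId   = NextValue,
        NextValue = NextValue + 100
    WHERE CounterName = @CounterName
      AND Stripe = @Stripe;
END

A caller (including a trigger) just runs EXEC dbo.GetNextId 'Orders', @Id OUTPUT. If it happens to run inside a parent transaction, only that one stripe's row stays locked until the parent commits, and concurrent callers will almost always land on a different stripe.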

usr
  • yes, I completely agree that our current scheme doesn't scale well! 1. Identity columns will be a problem, not just for converting data, but for application level logic that's not expecting it. It would require us to change a lot of code here. 2. Yes! Yes! That looks like exactly what we need. The problem is it's only supported in SQL Server 2012, right? 3. I don't think this one will work well for us. Consumers typically don't need large blocks and it would also require us to make massive application level changes. – Ron Feb 05 '14 at 19:50
  • @Ron I have added an idea that would work with the (extreme) design constraints you impose. I hope I have explained it well enough. – usr Feb 05 '14 at 20:17
0

One way is to get and increment the counter value in one statement:

DECLARE @NextKey int

-- assign the variable and increment the column in one atomic statement;
-- the WHERE clause assumes the one-row-per-table layout from the question
UPDATE Counter
SET @NextKey = NextKey + 1,
    NextKey = @NextKey
WHERE CounterName = 'MyTable'
D Stanley
  • This does not address the scaling problems because access to the counter table is still serialized. – usr Feb 05 '14 at 20:13
  • @D Stanley - I think that's a more elegant way to write my query, but it unfortunately doesn't solve the problem of running in a parent transaction from a trigger. Since the data is dirty and running in a transaction, it'll still lock up the table/row. – Ron Feb 05 '14 at 20:15