9

I created a stored procedure to implement rate limiting on my API. It is called about 5-10k times per second, and each day I'm noticing dupes in the counter table.


It looks up the API key being passed in, then checks the counter table for that ID and date combination using an "UPSERT" (MERGE): if it finds a row it does an UPDATE of [count]+1, and if not it INSERTs a new row.

There is no primary key in the counter table.

Here is the stored procedure:

USE [omdb]
GO
/****** Object:  StoredProcedure [dbo].[CheckKey]    Script Date: 6/17/2017 10:39:37 PM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROCEDURE [dbo].[CheckKey] (
@apikey AS VARCHAR(10)
)
AS
BEGIN

SET NOCOUNT ON;

DECLARE @userID as int
DECLARE @limit as int
DECLARE @curCount as int
DECLARE @curDate as Date = GETDATE()

SELECT @userID = id, @limit = limit FROM [users] WHERE apiKey = @apikey

IF @userID IS NULL
    BEGIN
        --Key not found
        SELECT 'False' as [Response], 'Invalid API key!' as [Reason]
    END
ELSE
    BEGIN
        --Key found
        BEGIN TRANSACTION Upsert
        MERGE [counter] AS t
        USING (SELECT @userID AS ID) AS s
        ON t.[ID] = s.[ID] AND t.[date] = @curDate
        WHEN MATCHED THEN UPDATE SET t.[count] = t.[count]+1
        WHEN NOT MATCHED THEN INSERT ([ID], [date], [count]) VALUES (@userID, @curDate, 1);
        COMMIT TRANSACTION Upsert

        SELECT @curCount = [count] FROM [counter] WHERE ID = @userID AND [date] = @curDate

        IF @limit IS NOT NULL AND @curCount > @limit
            BEGIN
                SELECT 'False' as [Response], 'Request limit reached!' as [Reason]
            END
        ELSE
            BEGIN
                SELECT 'True' as [Response], NULL as [Reason]
            END
    END
END

I also think some locking/blocking has been happening since introducing this SP.

The dupes aren't breaking anything, but I'm curious whether there is something fundamentally wrong with my code or whether I should just set up a constraint on the table to prevent this. Thanks

Update 6/23/17: I dropped the MERGE statement and tried using @@ROWCOUNT instead, but it also caused dupes:

BEGIN TRANSACTION Upsert
UPDATE [counter] SET [count] = [count]+1 WHERE [ID] = @userID AND [date] = @curDate
IF @@ROWCOUNT = 0 AND @@ERROR = 0
INSERT INTO [counter] ([ID], [date], [count]) VALUES (@userID, @curDate, 1)
COMMIT TRANSACTION Upsert
bfritz
  • I wouldn't use a MERGE statement for this. It hides intent in this case. Use an explicit transaction and separate select/insert/update – Mitch Wheat Jun 18 '17 at 05:45
  • @MitchWheat I was hoping using SQL's newly introduced "UPSERT" method would be more efficient than doing an UPDATE and checking the @@rowcount but it seems to be the opposite. – bfritz Jun 18 '17 at 10:03
  • why can't you set the id as a unique key for the table – Sagar V Jun 23 '17 at 13:57
  • @SagarV because the counter tracks multiple days in the same table. – bfritz Jun 23 '17 at 14:12
  • but the ID should be unique I think – Sagar V Jun 23 '17 at 14:12
  • @SagarV If the ID was unique I could only have one row per ID and would have no way to track their current request limit unless I reset everything to zero at midnight. And I'm also trying to keep some historic data for weekly reports. – bfritz Jun 23 '17 at 14:15
  • Does the `count` indicate what's currently used or the total limit for each user? – Sagar V Jun 23 '17 at 14:18
  • @SagarV It's a total count per day i.e. Daily Rate Limit – bfritz Jun 23 '17 at 14:23
  • Read this answer, part 2: https://stackoverflow.com/questions/35261411/how-to-get-the-next-number-in-a-sequence/35418227#35418227. Adjust for your key being ID + date. – TT. Jun 23 '17 at 14:27
  • @TT. that looks a lot like the solutions posted here: http://michaeljswart.com/2011/09/mythbusting-concurrent-updateinsert-solutions/ that I'm currently testing, but many have deadlock issues. – bfritz Jun 23 '17 at 14:32
  • @bfritz If you keep the transaction small with just locking the counter table, deadlocks should not be an issue. However I've never tested this for a load of 50k-100k/sec so I can't say if that is doable. – TT. Jun 24 '17 at 08:52

4 Answers

7

A HOLDLOCK hint on the update statement will avoid the race condition. To prevent deadlocks, I suggest a clustered composite primary key (or unique index) on ID and date.
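
For example, the constraint could be added like this (a sketch using the table and column names from the question; the constraint and index names are made up, and you'd need to drop or adjust any existing clustered index first):

-- Make (ID, date) unique so concurrent inserts cannot create duplicate rows
ALTER TABLE dbo.[counter]
    ADD CONSTRAINT PK_counter PRIMARY KEY CLUSTERED ([ID], [date]);

-- or, as a unique clustered index instead of a primary key:
-- CREATE UNIQUE CLUSTERED INDEX IX_counter_ID_date ON dbo.[counter] ([ID], [date]);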

The example below incorporates these changes and uses the SET <variable> = <column> = <expression> form of the SET clause to avoid the need for the subsequent SELECT of the final counter value and thereby improve performance.

ALTER PROCEDURE [dbo].[CheckKey]
    @apikey AS VARCHAR(10)
AS

SET NOCOUNT ON;
--SET XACT_ABORT ON is a best practice for procs with explicit transactions
SET XACT_ABORT ON; 

DECLARE
      @userID as int
    , @limit as int
    , @curCount as int
    , @curDate as Date = GETDATE();

BEGIN TRY;

    SELECT
          @userID = id
        , @limit = limit 
    FROM [users] 
    WHERE apiKey = @apikey;

    IF @userID IS NULL
    BEGIN
        --Key not found
        SELECT 'False' as [Response], 'Invalid API key!' as [Reason];
    END
    ELSE
    BEGIN
        --Key found
        BEGIN TRANSACTION Upsert;

        UPDATE [counter] WITH(HOLDLOCK) 
        SET @curCount = [count] = [count] + 1 
        WHERE
            [ID] = @userID 
            AND [date] = @curDate;

            IF @@ROWCOUNT = 0
            BEGIN    
                INSERT INTO [counter] ([ID], [date], [count]) 
                    VALUES (@userID, @curDate, 1);
            END;

        IF @limit IS NOT NULL AND @curCount > @limit
        BEGIN
            SELECT 'False' as [Response], 'Request limit reached!' as [Reason]
        END
        ELSE
        BEGIN
            SELECT 'True' as [Response], NULL as [Reason]
        END;

        COMMIT TRANSACTION Upsert;

    END;

END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0 ROLLBACK;
    THROW;
END CATCH;
GO
Dan Guzman
  • This looks promising! I was just about to try something very similar "UPDATE [counter] WITH (UPDLOCK, HOLDLOCK)". I like the SET @curCount trick to reduce the subsequent SELECT but if it's only occurring during the UPDATE wouldn't it be NULL during the first call/insert? I guess I could declare 1 as a default value to resolve that. – bfritz Jun 23 '17 at 19:27
  • An UPDLOCK isn't needed because the update will acquire an exclusive lock on the updated row or an exclusive range lock when the row doesn't exist. The HOLDLOCK is needed to avoid releasing the key range lock when no row is updated. I'll add error handling to the sample code. – Dan Guzman Jun 23 '17 at 19:40
  • Are you intentionally leaving out the @@ROWCOUNT check before the INSERT? I already had a clustered index on [ID] and [date] but it wasn't unique and I just fixed that, so no more dupes will occur, now it's just about optimization. – bfritz Jun 23 '17 at 20:31
  • @bfritz, I accidentally removed @@ROWCOUNT when I removed @@ERROR. Sorry about that. – Dan Guzman Jun 23 '17 at 20:36
2

Probably not the answer you're looking for, but for a rate-limiting counter I would use a cache like Redis in middleware before hitting the API. Performance-wise it's a great fit: Redis would have no problem with the load, and your DB wouldn't be impacted.

And if you want to keep a history of hits per API key per day in SQL, run a daily task to import yesterday's counts from Redis into SQL.

The data set would be small enough that a Redis instance would cost literally nothing (or close to it).

drkbrd
  • Something before/fronting SQL Server seems like the way to go (whether Redis or something else). A really smart cache could even batch inputs from the few API keys that are power-users. – Darius X. Jul 02 '17 at 12:03
1

It will be the MERGE statement getting into a race condition with itself, i.e. your API is called twice by the same client at nearly the same moment, both executions of the MERGE find no row, and both insert one. MERGE isn't an atomic operation, even though it's reasonable to assume it is. For example, see this bug report for SQL Server 2008 about MERGE causing deadlocks; the SQL Server team said this is by design.

From your post I think the immediate issue is that your clients will potentially be getting a small number of free hits on your API. For example, if two requests come in and neither sees a row, you'll end up with two rows with a count of 1 when you actually want one row with a count of 2, so the client could get 1 free API hit that day. If three requests crossed over, you'd get three rows with a count of 1 and they could get 2 free API hits, etc.

Edit

So, as your link suggests, you've got two categories of options you could explore: first, just getting this working in SQL Server; second, other architectural solutions.

For the SQL option, I would do away with the MERGE and consider pre-populating the counter rows for your clients ahead of time, either nightly or less often for several days at a time; this leaves you with a single UPDATE instead of the MERGE/UPDATE-and-INSERT. Then you can confirm both your UPDATE and your SELECT are fully optimised, i.e. that they have the necessary indexes and aren't causing scans. Next you could look at tweaking locking so you're only locking at the row level; see this for some more info. For the SELECT you could also look at using NOLOCK, which means you could read slightly incorrect data, but that shouldn't matter in your case since your WHERE always targets a single row anyway.
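
A rough sketch of the pre-population idea, using the [counter] and [users] tables from the question (the job scheduling itself is omitted, and the exact date arithmetic is an assumption):

-- Nightly job: pre-create tomorrow's counter rows for every API key,
-- so the per-request path never has to insert
INSERT INTO [counter] ([ID], [date], [count])
SELECT u.id, DATEADD(DAY, 1, CAST(GETDATE() AS date)), 0
FROM [users] AS u
WHERE NOT EXISTS (
    SELECT 1
    FROM [counter] AS c
    WHERE c.[ID] = u.id
      AND c.[date] = DATEADD(DAY, 1, CAST(GETDATE() AS date))
);

-- The per-request path then becomes a single row-level UPDATE, no MERGE or insert branch
UPDATE [counter]
SET [count] = [count] + 1
WHERE [ID] = @userID
  AND [date] = @curDate;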

For the non-SQL options, as your link says, you could look at queuing things up; the queue would hold the updates/inserts, so your selects would be seeing slightly old data. That may or may not be acceptable depending on how far apart they are, although you could treat it as an "eventually consistent" solution if you wanted to be strict and charge extra or deduct API hits the next day. You could also look at caching options to store the counts; this gets more complex if your app is distributed, but there are caching solutions for that. If you went with caching you could choose not to persist anything, but then you'd potentially give away a load of free hits if your site went down, although you'd probably have bigger issues to worry about then anyway!
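
A very rough sketch of the queuing idea in T-SQL (the hit_queue table and the drain job are hypothetical, not something from the question; @userID and @curDate are as in the original procedure):

-- Per request: a cheap append-only insert, no contention on [counter]
-- CREATE TABLE dbo.hit_queue ([ID] int NOT NULL, [date] date NOT NULL);
INSERT INTO dbo.hit_queue ([ID], [date]) VALUES (@userID, @curDate);

-- Periodic background job: drain the queue and fold the hits into [counter]
DECLARE @drained TABLE ([ID] int NOT NULL, [date] date NOT NULL);

DELETE FROM dbo.hit_queue
OUTPUT deleted.[ID], deleted.[date] INTO @drained;

UPDATE c
SET c.[count] = c.[count] + d.hits
FROM [counter] AS c
JOIN (
    SELECT [ID], [date], COUNT(*) AS hits
    FROM @drained
    GROUP BY [ID], [date]
) AS d ON d.[ID] = c.[ID] AND d.[date] = c.[date];
-- (assumes counter rows already exist for the day; otherwise add an insert branch)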

Matt
  • I'm not that concerned with free hits per se, but I do need a concurrent solution that doesn't slow down/lock other transactions. I found some interesting results about high volume concurrent updates/inserts here: http://michaeljswart.com/2011/09/mythbusting-concurrent-updateinsert-solutions/ – bfritz Jun 23 '17 at 15:00
0

At a high level, have you considered pursuing the following scenario?

Restructuring: Set the primary key on your table to be a composite of (ID, date). Possibly even better, just use the API key itself instead of the arbitrary ID you're assigning to it.

Query A: Do SQL Server's equivalent of "INSERT IGNORE" (it seems there are semantic equivalents for SQL Server, based on a Google search) with the values (ID, TODAY(), 1). You'll also want to specify a WHERE clause that checks that the ID actually exists in your API/limits table.

Query B: Update the row with (ID, TODAY()) as its primary key, setting count := count + 1, and in the very same query, do an inner join with your limits table, so that in the where clause you can specify that you'll only update the count if the count < limit.
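
A rough T-SQL sketch of those two queries (hypothetical: SQL Server has no INSERT IGNORE, so Query A is written as a guarded INSERT ... SELECT, the limit is assumed to live in the [users] table as in the question, and @userID is the looked-up key ID):

-- Query A: "INSERT IGNORE" equivalent: insert today's row only if the key is
-- known and no row exists yet (with the composite primary key in place, a
-- concurrent insert that slips through raises a duplicate key error instead)
INSERT INTO [counter] ([ID], [date], [count])
SELECT @userID, CAST(GETDATE() AS date), 1
WHERE EXISTS (SELECT 1 FROM [users] WHERE id = @userID)
  AND NOT EXISTS (SELECT 1 FROM [counter]
                  WHERE [ID] = @userID AND [date] = CAST(GETDATE() AS date));

-- Query B: increment only while still under the limit, joining to the limits table
UPDATE c
SET c.[count] = c.[count] + 1
FROM [counter] AS c
JOIN [users] AS u ON u.id = c.[ID]
WHERE c.[ID] = @userID
  AND c.[date] = CAST(GETDATE() AS date)
  AND (u.limit IS NULL OR c.[count] < u.limit);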

If the majority of your requests are valid API requests or rate-limited requests, I would perform queries in the following order on every request (a rough T-SQL rendering of this flow follows the list):

Run Query B.
If 0 rows updated:
 Run query A.
 If 0 rows updated:
  Run query B.
  If 0 rows updated, reject because of rate limit.
  If 1 row updated, continue.
 If 1 row updated:
  continue.
If 1 row updated:
 continue.
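
Put together, that flow might look roughly like this (a sketch reusing the hypothetical Query A/B above; @userID and @curDate are as in the question's procedure, and a duplicate key error from Query A would still need handling by the caller):

DECLARE @ok bit = 0;

-- Query B: increment only while under the limit
UPDATE c
SET c.[count] = c.[count] + 1
FROM [counter] AS c
JOIN [users] AS u ON u.id = c.[ID]
WHERE c.[ID] = @userID AND c.[date] = @curDate
  AND (u.limit IS NULL OR c.[count] < u.limit);
IF @@ROWCOUNT = 1 SET @ok = 1;

IF @ok = 0
BEGIN
    -- Query A: create today's row if the key is valid and no row exists yet
    INSERT INTO [counter] ([ID], [date], [count])
    SELECT @userID, @curDate, 1
    WHERE EXISTS (SELECT 1 FROM [users] WHERE id = @userID)
      AND NOT EXISTS (SELECT 1 FROM [counter]
                      WHERE [ID] = @userID AND [date] = @curDate);
    IF @@ROWCOUNT = 1
        SET @ok = 1;
    ELSE
    BEGIN
        -- Insert lost the race (or the key was invalid): retry Query B once
        UPDATE c
        SET c.[count] = c.[count] + 1
        FROM [counter] AS c
        JOIN [users] AS u ON u.id = c.[ID]
        WHERE c.[ID] = @userID AND c.[date] = @curDate
          AND (u.limit IS NULL OR c.[count] < u.limit);
        IF @@ROWCOUNT = 1 SET @ok = 1;
    END
END

-- @ok = 0 here means the request should be rejected (limit reached or invalid key)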

If the majority of your requests are invalid API requests, I'd do the following:

Run query A.
 If 0 rows updated:
  Run query B.
  If 0 rows updated, reject because of rate limit.
  If 1 row updated, continue.
 If 1 row updated:
  continue.
Kurtiss Hare