I ran into the same issue, though on Redis 2.8.24, and I was also using it for API rate limiting.
I suspect you are doing the rate limiting like this (using Ruby code just for the example):
    def consume_rate_limit
      # Fetch the current limit for a given account or user
      rate_limit = Redis.get('available_limit:account_id')
      # It can be nil if not already initialized or if the TTL has expired
      if rate_limit.nil?
        # So let's just initialize it to the initial limit
        # Let's use a window of 10,000 requests, resetting every hour
        rate_limit = 10000
        Redis.setex('available_limit:account_id', 3600, rate_limit - 1)
      else
        # If the key already exists, just decrement the limit
        # (GET returns a string, so convert it before comparing below)
        rate_limit = rate_limit.to_i
        Redis.decr('available_limit:account_id')
      end
      # Return true if we are OK or false if the limit has been reached
      return (rate_limit > 0)
    end
Well, I was using this approach and found out there's a concurrency problem between the "get" and the "decr" calls, which leads to the exact issue you described.
The issue happens when the TTL of the rate-limit key expires just after the "get" call but before the "decr" call. Here is what happens:
First, the "get" call returns the current limit. Let's say it returned 500.
Then, within a fraction of a millisecond, the TTL of that key expires, so the key no longer exists in Redis.
The code keeps running and reaches the "decr" call. This is where the bug shows up:
The decr documentation states (my emphasis):
Decrements the number stored at key by one. If the key does not
exist, it is set to 0 before performing the operation. (...)
As the key has been deleted (because it has expired), the "decr" instruction will initialize the key to zero and then decrement it, which is why the key value is -1. The key is also created without a TTL, so issuing TTL key_name will return -1 as well.
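The sequence above can be replayed without a Redis server. The following is only a sketch: FakeRedis and its manual clock are hypothetical stand-ins that mimic GET, SETEX, DECR and TTL just closely enough to show the three steps in order.

```ruby
# Minimal in-memory stand-in for the Redis commands used above.
# FakeRedis is hypothetical; it is not part of any Redis client library.
class FakeRedis
  attr_accessor :now

  def initialize
    @now = 0       # fake clock, in seconds
    @data = {}     # key => integer value
    @expires = {}  # key => absolute expiry time, absent when no TTL is set
  end

  def get(key)
    purge(key)
    @data[key]
  end

  def setex(key, seconds, value)
    @data[key] = value.to_i
    @expires[key] = @now + seconds
  end

  # Like Redis DECR: a missing key is treated as 0, and no TTL is added.
  def decr(key)
    purge(key)
    @data[key] = (@data[key] || 0) - 1
  end

  # Like Redis TTL (2.8+): -2 if the key is missing, -1 if it has no expiry.
  def ttl(key)
    purge(key)
    return -2 unless @data.key?(key)
    return -1 unless @expires.key?(key)
    @expires[key] - @now
  end

  private

  # Drop the key once its expiry time has passed.
  def purge(key)
    expiry = @expires[key]
    return unless expiry && @now >= expiry
    @data.delete(key)
    @expires.delete(key)
  end
end

redis = FakeRedis.new
key = 'available_limit:account_id'

redis.setex(key, 3600, 500)    # counter is at 500 with one hour to live
seen = redis.get(key)          # step 1: "get" returns 500
redis.now += 3600              # step 2: the TTL runs out right after the read
after_decr = redis.decr(key)   # step 3: "decr" recreates the key from 0
ttl_after = redis.ttl(key)     # the recreated key has no expiry

puts "get=#{seen} decr=#{after_decr} ttl=#{ttl_after}"
```

The caller still sees 500 and lets the request through, while the stored counter is now stuck at a negative value with no expiry.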
One possible fix is to wrap all that code inside a transaction block using the MULTI and EXEC commands. Because commands queued inside MULTI only return their results at EXEC, the code would also have to WATCH the key in order to branch on the value it read. That might be slow, since it requires multiple round trips to the Redis server.
The solution I used was to write a Lua script and run it with the EVAL command. It has the advantage of being atomic (which means no concurrency issues) and needs only one round trip to the Redis server.
    local expire_time = ARGV[1]
    local initial_rate_limit = ARGV[2]
    local rate_limit = redis.call('get', KEYS[1])
    -- rate_limit will be false when the key does not exist.
    -- That's because redis converts Nil to false in Lua scripts.
    if rate_limit == false then
      rate_limit = initial_rate_limit
      -- SETEX takes the TTL in seconds as its second argument
      redis.call('setex', KEYS[1], expire_time, rate_limit - 1)
    else
      redis.call('decr', KEYS[1])
    end
    return rate_limit
To use it, we could rewrite the consume_rate_limit function like this:
    def consume_rate_limit
      script = <<-LUA
        ... that script above, omitting it here not to bloat things ...
      LUA
      rate_limit = Redis.eval(script, keys: ['available_limit:account_id'], argv: [3600, 10000]).to_i
      return (rate_limit > 0)
    end
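To illustrate why doing the read and the update in one step avoids the stray -1, here is a rough in-memory model of the script's logic. AtomicLimiter, its fake clock and its Mutex are assumptions made for the illustration; the real guarantee comes from Redis running the Lua script atomically, not from this class.

```ruby
# Hypothetical in-memory model of what the Lua script does; not a Redis client.
class AtomicLimiter
  attr_accessor :now
  attr_reader :value

  def initialize(expire_time, initial_limit)
    @expire_time = expire_time
    @initial_limit = initial_limit
    @now = 0          # fake clock, in seconds
    @value = nil      # nil plays the role of a missing key
    @expires_at = nil
    @lock = Mutex.new
  end

  # Mirrors the script: read, then either initialize with a TTL or decrement,
  # all inside one critical section so expiry cannot slip in between.
  def consume
    @lock.synchronize do
      expire_if_due
      if @value.nil?
        seen = @initial_limit
        @value = @initial_limit - 1
        @expires_at = @now + @expire_time
      else
        seen = @value
        @value -= 1
      end
      seen > 0
    end
  end

  # Seconds left before the counter resets, or nil when no counter exists.
  def ttl
    @lock.synchronize do
      expire_if_due
      @expires_at && @expires_at - @now
    end
  end

  private

  # Forget the counter once its expiry time has passed.
  def expire_if_due
    return unless @expires_at && @now >= @expires_at
    @value = nil
    @expires_at = nil
  end
end

limiter = AtomicLimiter.new(3600, 2)
first_window = 3.times.map { limiter.consume }  # third call is over the limit
limiter.now += 3600                              # the window expires
after_expiry = limiter.consume                   # counter is rebuilt with a TTL

puts first_window.inspect, after_expiry, limiter.value, limiter.ttl
```

When the window expires here, the next call always takes the "missing key" branch and recreates the counter together with its TTL, so it never ends up negative without an expiry.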