
I want to delete a large number of keys from Redis. I already have all the key names on hand, so there is no need to search for them. I'm considering two options:

  1. Use a single DEL command with multiple keys
  2. Use many DEL commands in a single pipeline

I've done some performance testing on my local machine, and DEL with multiple keys (option 1) appears to be almost 10 times faster than the pipeline.

My concern is that because DEL is atomic, deleting, say, 10k keys with one command will be much faster overall but will block Redis for the duration of that single DEL. The pipeline, on the other hand, is slower, but it executes commands one by one, so other clients of the same Redis instance are not blocked.

I can't find a definitive answer on whether DEL with multiple keys blocks until all keys are deleted. My tests suggest that it doesn't block, but I don't understand why, since that seems to contradict the documentation.

Test code:

let data = await setSomeDataInRedis();
let keys = data.map(([k]) => k);
const startDel = Date.now();

console.log('\n== del[] ==')
// Sending 3 requests over 2 connections to Redis in parallel.
// Well, almost in parallel, because it's Node.js and the network is serial.
await Promise.all([
  redisClient1.hmget(['always-there', 'name']) // I expect this one returns 1st as it comes before DEL
    .then((r) => console.log(`get-before-del[] - ${Date.now() - startDel}ms - ${JSON.stringify(r)}`)),
  redisClient2.del(keys)                       // I expect this one returns 2nd and block 3rd on Redis side
    .then(() => console.log(`del[] - ${Date.now() - startDel}ms`)),
  redisClient1.hmget(['always-there', 'name']) // I expect this one returns 3rd because 2nd blocks Redis
    .then((r) => console.log(`get-after-del[] - ${Date.now() - startDel}ms - ${JSON.stringify(r)}`)),
])

console.log('\n== pipeline ==')
// Same here: sending 3 requests over 2 connections to Redis in parallel.
data = await setSomeDataInRedis();
const pipeline = redisClient2.pipeline();
data.forEach(([k]) => pipeline.del(k));
const startPipe = Date.now();
await Promise.all([
  redisClient1.hmget(['always-there', 'name']) // I expect this one returns 1st as it comes before DEL
    .then((r) => console.log(`get-before-pipeline - ${Date.now() - startPipe}ms - ${JSON.stringify(r)}`)),
  pipeline.exec()                              // I expect this one returns 3rd as it has lots of commands, but non-blocking
    .then(() => console.log(`pipeline - ${Date.now() - startPipe}ms`)),
  redisClient1.hmget(['always-there', 'name']) // I expect this one returns 2nd as pipeline is non-blocking
    .then((r) => console.log(`get-after-pipeline - ${Date.now() - startPipe}ms - ${JSON.stringify(r)}`)),
])

console.log('\n== unlink[] ==')
data = await setSomeDataInRedis();
keys = data.map(([k]) => k);
const startUnlink = Date.now();
await Promise.all([
  redisClient1.hmget(['always-there', 'name'])
    .then((r) => console.log(`get before unlink[] - ${Date.now() - startUnlink}ms - ${JSON.stringify(r)}`)),
  redisClient2.unlink(keys)
    .then(() => console.log(`unlink[] - ${Date.now() - startUnlink}ms`)),
  redisClient1.hmget(['always-there', 'name'])
    .then((r) => console.log(`get after unlink[] - ${Date.now() - startUnlink}ms - ${JSON.stringify(r)}`)),
])

Test output:

== del[] ==
get-before-del[] - 62ms - ["name0.06551250522960261"]
get-after-del[] - 64ms - ["name0.06551250522960261"]
del[] - 183ms

== pipeline ==
get-before-pipeline - 127ms - ["name0.9763696909778301"]
get-after-pipeline - 131ms - ["name0.9763696909778301"]
pipeline - 1325ms

== unlink[] ==
get before unlink[] - 41ms - ["name0.25533683439953236"]
get after unlink[] - 47ms - ["name0.25533683439953236"]
unlink[] - 153ms

The fact that get-after-del[] returns before del[] suggests that DEL with multiple keys is non-blocking (non-atomic), which seems to contradict the Redis docs.

Edit: I intentionally left UNLINK out of the original version of the question. My tests show that it is 10-20% faster, but it doesn't solve the initial problem: it would still block Redis for a significant amount of time. Since the first answer recommends UNLINK, I'm adding this note and the UNLINK tests.

1 Answer


"whether DEL with multiple keys will block until all keys are deleted"

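Yes, it will block, unless lazyfree-lazy-user-del is enabled in your Redis configuration. Lazy freeing has been available since Redis 4.0, but this particular option was added in Redis 6.0. With it enabled, DEL behaves like UNLINK: the keys are still removed from the keyspace synchronously, and only the memory reclamation moves to a background thread. Check this for more info.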
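To see which mode a server is in, the option can be read at runtime with CONFIG GET. Below is a minimal sketch, assuming an ioredis-style client whose `config('GET', name)` call resolves to a flat `[name, value]` array. The helper name `isLazyUserDelEnabled` is made up for illustration.

```javascript
// Hypothetical helper: reports whether DEL is configured to free memory lazily.
// Assumes an ioredis-style client where config('GET', name) resolves to [name, value].
async function isLazyUserDelEnabled(client) {
  const reply = await client.config('GET', 'lazyfree-lazy-user-del');
  // Servers that do not know the option reply with an empty array.
  return Array.isArray(reply) && reply[1] === 'yes';
}
```

With the clients from the question this would be called as `await isLazyUserDelEnabled(redisClient1)`.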

"My tests show that it doesn't block, but I don't understand why - it's kinda contradicts documentation."

First of all, check the config mentioned above. Secondly, your test is not accurate. Because the calls are asynchronous, the order of the output depends on how and when each then callback is scheduled. It also depends on whether the client uses a connection pool to send commands to Redis. For example, the second hmget and the del might be sent over different connections, and the second hmget might reach Redis first. I'm not familiar with your client, so correct me if I'm wrong.

Also, you'd better use unlink instead of del when you have many keys to delete. Check this for details.
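If one very large command is still a concern, a possible middle ground between a single 10k-key command and 10k pipelined commands is to send the keys in moderately sized batches. Redis runs one command at a time, so commands from other clients can be served between batches. The sketch below assumes a client whose `unlink` accepts an array of keys and resolves to the number of keys removed, as in the question's test code. `chunk`, `unlinkInBatches`, and the default batch size of 500 are illustrative choices, not tuned values.

```javascript
// Split an array into consecutive slices of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const out = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Hypothetical helper: one UNLINK per batch, awaited in sequence.
// Each batch is a separate command, so the server is free to run
// other clients' commands between two batches.
async function unlinkInBatches(client, keys, batchSize = 500) {
  let removed = 0;
  for (const batch of chunk(keys, batchSize)) {
    removed += await client.unlink(batch);
  }
  return removed;
}
```

Usage would look like `await unlinkInBatches(redisClient2, keys, 1000)`. Smaller batches shorten each individual stall at the cost of more round trips.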

for_stack
  • "it depends on whether the client has a connection pool" - I have 2 connections to Redis, `redisClient1` and `redisClient2`. "Secondly, your test is not accurate" - I agree that the tests might not be 100% accurate (even without Node.js, simply because a network is involved), but given that the 2nd `hmget`'s `then` is triggered 140ms earlier than expected, it does support my point. I'd appreciate ideas for better tests, even in other languages. "Also, you'd better use `unlink` instead of `del`" - I added a comment about unlink in the post edit. It shows the same behaviour as `del`, which I can't explain. – Viktor Molokostov Oct 09 '22 at 09:33
  • You can use a single connection with sync interface to do the test. As I mentioned above, if you have multiple connections, e.g. a connection pool, the two `hmget` calls might be sent with connection A, `del` or `unlink` might be sent with connection B, and both `hmget` calls are sent before `del` or `unlink`. – for_stack Oct 10 '22 at 01:54
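One way to make such a test less dependent on callback ordering is to probe latency from a second connection while the delete is in flight: if the server is busy with one long command, a PING sent during that window should take noticeably longer to return. The following is a rough sketch only, assuming a client with a `ping()` method that returns a promise. `maxPingLatencyDuring` and its parameters are hypothetical names.

```javascript
// Hypothetical probe: pings on `probeClient` in a loop while `work()` is pending
// and reports the slowest round trip observed.
async function maxPingLatencyDuring(probeClient, work, intervalMs = 1) {
  let running = true;
  let maxMs = 0;
  let samples = 0;

  const probe = (async () => {
    while (running) {
      const t0 = Date.now();
      await probeClient.ping();
      maxMs = Math.max(maxMs, Date.now() - t0);
      samples += 1;
      // Yield to the event loop so the probe never starves other callbacks.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  })();

  try {
    await work();
  } finally {
    running = false; // stop the probe even if work() rejects
  }
  await probe;
  return { maxMs, samples };
}
```

For example, `await maxPingLatencyDuring(redisClient1, () => redisClient2.del(keys))`. Note that a stall in the Node.js event loop itself (for instance while a 10k-key command is being serialized) also shows up in this number, so running the probe from a separate process, or using `redis-cli --latency`, gives a cleaner signal.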