Why KEYS is advised not to be used in Redis?

Question

In Redis, it is advised not to use KEYS command. Why it is so? Is it because its time complexity is O(N) ? Or something else is the reason.

http://stackoverflow.com/questions/32603964/scan-vs-keys-performance-in-redis/32604218#32604218 — Matías Fidemraizer, Jan 09 '17 at 21:03

score 4 · Answer 1 · answered May 19 '21 at 20:32

I did the following experiment to prove how dangerous KEYS command is.

While one command with KEYS runs, others KEYS commands are waiting for the time to run. One run of the KEYS command has 2 phases, first is to get the information from Redis, second is to send it to the client.

$ time src/redis-cli keys "*" | wc -l
1450832
real    0m17.943s
user    0m8.341s


$ src/redis-cli
127.0.0.1:6379> slowlog get
1) 1) (integer) 0
   2) (integer) 1621437661
   3) (integer) 8321405
   4) 1) "keys"
      2) "*"

So, it was running on Redis for 8s and then was piped to 'wc' command. Redis finished with this command in 8s but 'wc' command needed that data for 17s to complete the couting. So the memory buffers had to be there for at least 17s. Now, let's imagine clients on the network, where this data has to go to the clients as well. If we have 10 keys commands, that will run on Redis one by one, when the first one finishes and next one runs the results of the first command has to be stored in memory before the client will consume them. That all takes memory, so I can imagine a situation, where 5th client is running KEYS command but we still need to keep the data for the first client, because they were still not trasfered through the network.

Let's test it out.

Scenario: Let's have Redis DB with 200M size (1000M physical memory) and check how much memory is one execution of KEYS takes, and how long when done through the network. Then simulate 5 same KEYS commands to be run and see if it kills Redis.

$ src/redis-cli info memory
used_memory_human:214.17M
total_system_memory_human:926.08M

When run from the same node:
$ time src/redis-cli keys "*" | wc -l
1450832
real    0m17.702s
user    0m8.278s

$ free -m
              total        used        free      shared  buff/cache   available
Mem:            926         301         236          24         388         542
Mem:            926         336         200          24         388         507
Mem:            926         368         168          24         388         475
Mem:            926         445          91          24         388         398
Mem:            926         480          52          24         393         363
Mem:            926         491          35          24         399         352
-> looks like it consumed 190M for the KEYS command

-> So, Redis is busy with the command for 8s, but the memory is consumed for this command for 17s. -> running just one KEYS command just blocks the Redis for 8s, but does not cause the OOM

Let's run 2 KEYS commands at the (almost) same time (that will run one after another anyway)

$ time src/redis-cli keys "*" | wc -l &
$ time src/redis-cli keys "*" | wc -l &

$ free -m
              total        used        free      shared  buff/cache   available
Mem:            926         300         430          24         194         546
Mem:            926         370         361          24         194         477
Mem:            926         454         276          24         194         393
Mem:            926         589         141          24         194         258
Mem:            926         693          37          24         194         154
-> now we used 392M memory for 26s, while Redis is hung for 17s
-> but we still have a running Redis

Let's run 3 KEYS commands at the (almost) same time (that will run one after another anyway)

$ time src/redis-cli keys "*" | wc -l &
$ time src/redis-cli keys "*" | wc -l &
$ time src/redis-cli keys "*" | wc -l &

$ free -m
              total        used        free      shared  buff/cache   available
Mem:            926         299         474          23         152         549
Mem:            926         385         388          23         152         463
Mem:            926         512         261          23         152         336
Mem:            926         573         200          23         152         275
Mem:            926         711          61          23         152         136
Mem:            926         842          21          21          62          17
-> now we used 532M memory for 36s, while Redis is hung for 26s
-> but we still have a running Redis

Let's run 4 KEYS commands at the (almost) same time (that will run one after another anyway)
$ time src/redis-cli keys "*" | wc -l &
$ time src/redis-cli keys "*" | wc -l &
$ time src/redis-cli keys "*" | wc -l &
$ time src/redis-cli keys "*" | wc -l &
-> that kills Redis

Nothing in the Redis logs:

2251:C 19 May 16:03:05.355 * DB saved on disk
2251:C 19 May 16:03:05.379 * RDB: 2 MB of memory used by copy-on-write
1853:M 19 May 16:03:05.432 * Background saving terminated with success

In /var/log/messages

May 19 16:08:01 consumer2 kernel: [454881.744017] redis-cli invoked oom-killer: gfp_mask=0x6200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
May 19 16:08:01 consumer2 kernel: [454881.744180] [<8023bdb8>] (oom_kill_process) from [<8023c6e8>] (out_of_memory+0x134/0x36c)

Conclusion:

we can kill healthy Redis instance, consuming 200M of RAM, where 70% RAM free on OS by just running 4 KEYS commands issued one after another and run one after another. Just because the results has to be buffered even after Redis is finished with executing them.
one is unable to protect Redis against that behavior with maxmemory, as the memory usage is not a result of SET command

score 2 · Accepted Answer · answered Jan 09 '17 at 06:19

2

Yes.

Time complexity is very bad. Note that the N in O(N) refers to the total number of keys in the database, not the number of keys being selected by the filter pattern. So this can be a really big number for a production database.

And even worse, since only one command can run at the same time (Redis not being multi-threaded), everything else will have to wait for that KEYS to complete.

answered Jan 09 '17 at 06:19

Thilo

257,207
101
511
656

Okay. But SMEMBERS command can be used even if it is also O(N). Because here N is just the number of keys in the set (not in entire database) and the maximum value of N can be 2^32-1, which is not going to be an issue. Am I correct? – Jsmith Jan 09 '17 at 06:29
Yes, unless that set is also huge (you will need to rethink the design long before you reach 2^32 set members). – Thilo Jan 09 '17 at 06:34
Thank you. You are saying that redis can execute only one command at a time as it is single threaded. But by its by time multiplexing won't it handle other commands concurrently? So it is not necessary that the other command has to wait until KEYS command is complete. Please refer: http://stackoverflow.com/questions/10489298/redis-is-single-threaded-then-how-does-it-do-concurrent-i-o – Jsmith Jan 09 '17 at 06:38
Moreover the issue (waiting until a command is complete) you are saying will happen only in case of EVAL/running a Lua script I guess. – Jsmith Jan 09 '17 at 06:39
I think you are misunderstanding the linked article. All Redis commands are executed sequentially. http://stackoverflow.com/a/17099452/14955 – Thilo Jan 09 '17 at 06:42

Why KEYS is advised not to be used in Redis?

2 Answers2

Linked