
I am developing a web application (Nginx + PHP 7.3) that will use a Redis database to store some data (mainly to count things), and I have to decide how to store that data. What is important in my case is speed and performance, and also keeping the number of operations low so the web application can handle many concurrent connections.

Option 1: Store JSON data on a single key

To save the data I would use a single SET operation, e.g.:

$redis->set("MyKey-UserID", '{"clicks":123,"downloads":1234,"views":123}');

Then to update the data I would use two operations (GET + SET), e.g.:

$array = json_decode($redis->get("MyKey-UserID"), true);

$array['clicks']++;
$array['downloads']++;
$array['views']++;

$redis->set("MyKey-UserID", json_encode($array));

Option 2: Multiple keys with single value

To save the data I would use multiple SET operations, e.g.:

$redis->set("MyKey-UserID-Clicks", 123);
$redis->set("MyKey-UserID-Downloads", 1234);
$redis->set("MyKey-UserID-Views", 123);

Then to update the data I would use multiple INCR operations, e.g.:

$redis->incr("MyKey-UserID-Clicks");
$redis->incr("MyKey-UserID-Downloads");
$redis->incr("MyKey-UserID-Views");

My Selected Option + Questions

Personally I would go with Option 1. What are your opinions?

Do you think GET + SET will still be as fast as INCR?

What do you think about Option 2?

My Pros/Cons for Option 1

Option 1 Pros:

  • Better database organization, since there is only one key per user
  • A single GET operation returns all the JSON data
  • Updating all JSON fields takes just two operations (GET + SET)
  • The database will be smaller in file size

Option 1 Cons:

  • Incrementing just "clicks" takes two operations (GET + SET) instead of a single INCR
  • The GET + SET procedure of Option 1 might be slower than the multiple INCRs of Option 2

Some Useful Answers

@Samveen (Link)

Option 1 is not a good idea if concurrent modification of the JSON payload is expected (a classic problem of non-atomic read-modify-write)

We have many concurrent connections, so maybe Option 2 is the winner.
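
For reference, the usual way to make Option 1's read-modify-write safe is optimistic locking with WATCH, which costs even more operations; a rough phpredis sketch (retry handling simplified):

do {
    // WATCH the key, then read and modify it outside the transaction
    $redis->watch("MyKey-UserID");
    $array = json_decode($redis->get("MyKey-UserID"), true);
    $array['clicks']++;

    // If another client wrote MyKey-UserID between WATCH and EXEC,
    // exec() should return FALSE and the loop retries
    $ret = $redis->multi()
        ->set("MyKey-UserID", json_encode($array))
        ->exec();
} while ($ret === false);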

2 Answers


I'm adding this answer after taking the suggestions from @TheDude.

Option 3: Using Hashes (The Winner)

To save the data I would use a single hMSet call, e.g.:

$redis->hMSet('MyKey-UserID', array('clicks' => 123, 'downloads' => 123, 'views' => 123));

Then to update the fields I would use multiple hIncrBy calls, e.g.:

$redis->hIncrBy('MyKey-UserID', 'clicks', 2);
$redis->hIncrBy('MyKey-UserID', 'downloads', 2);
$redis->hIncrBy('MyKey-UserID', 'views', 2);

With this method I have a single hash per user (MyKey-UserID) to which I can add custom fields.

So the database stays small (compared to Option 2) and concurrent writes are safe (compared to Option 1), because each hIncrBy is atomic.
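
Reading everything back is still a single operation, e.g.:

// Returns all fields as an associative array of strings,
// e.g. array('clicks' => '125', 'downloads' => '125', 'views' => '125')
$counters = $redis->hGetAll('MyKey-UserID');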

According to the phpredis documentation, I could also use multi() to run multiple commands as a single transaction:

A Redis::MULTI block of commands runs as a single transaction

https://github.com/phpredis/phpredis#multi-exec-discard

So I could update several fields, or all of them, in a single request like this:

$ret = $redis->multi()
     ->hIncrBy('MyKey-UserID', 'clicks', 2)
     ->hIncrBy('MyKey-UserID', 'downloads', 2)
     ->hIncrBy('MyKey-UserID', 'views', 2)
     ->exec();
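
If I don't need the atomicity of a transaction, phpredis also supports a pipeline mode (Redis::PIPELINE) with the same chained syntax, which should send all the commands in a single round trip:

// Same batch as above, but pipelined instead of transactional
$ret = $redis->multi(Redis::PIPELINE)
     ->hIncrBy('MyKey-UserID', 'clicks', 2)
     ->hIncrBy('MyKey-UserID', 'downloads', 2)
     ->hIncrBy('MyKey-UserID', 'views', 2)
     ->exec();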

Hashes vs. SET/GET (key=value) data type

According to this answer: https://stackoverflow.com/a/24505485/2972081

Use hashes when possible

Small hashes are encoded in a very small space, so you should try representing your data using hashes every time it is possible. For instance if you have objects representing users in a web application, instead of using different keys for name, surname, email, password, use a single hash with all the required fields.

I made some benchmarks and here are the results:

hset myhash rand_string rand_int: 31377.47 requests per second
hget myhash rand_string: 30750.31 requests per second
hincrby myhash rand_string: 30312.21 requests per second
set rand_string: 30703.10 requests per second
get rand_string: 30969.34 requests per second
incrby rand_string: 30581.04 requests per second

The command I used for the benchmark is this:

redis-benchmark -n 100000 -q hset myhash rand_string rand_int

So hashes are just as fast as plain GET/SET on strings.
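
To verify the memory claim on your own data, Redis 4.0+ has the MEMORY USAGE command; I don't think phpredis has a dedicated wrapper for it, so rawCommand() is one way to call it, e.g.:

// Approximate number of bytes each key occupies (Redis >= 4.0)
$hashBytes   = $redis->rawCommand('MEMORY', 'USAGE', 'MyKey-UserID');
$stringBytes = $redis->rawCommand('MEMORY', 'USAGE', 'MyKey-UserID-Clicks');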

– user2972081

If you need to update individual fields and re-save the whole payload, Option 1 isn't ideal because the read-modify-write cycle doesn't handle concurrent writes safely.

You should be able to use a HASH in Redis, and use HINCRBY to increment individual fields in the hash. Couple that with a pipeline, and you would only make one request to Redis when updating multiple fields.

You can use HGETALL to get all of the key/value pairs in the hash.
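
A minimal phpredis sketch of that combination (key and field names borrowed from the question) might look like:

// Batch the HINCRBY calls into one request, then read the hash back
$redis->multi(Redis::PIPELINE)
    ->hIncrBy('MyKey-UserID', 'clicks', 1)
    ->hIncrBy('MyKey-UserID', 'views', 1)
    ->exec();

$counters = $redis->hGetAll('MyKey-UserID');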

– TheDude
  • This seems like an interesting approach; do you know if there is any performance/speed difference compared to using GET/SET/INCR, or is it the same? – user2972081 Feb 13 '19 at 16:29
  • I would imagine it would have comparable performance, but I can't point you to an exact answer. – TheDude Feb 13 '19 at 16:37