
I have an Object A that has 20 fields, and I have 1 million instances of Object A.

I want to store them in Redis, and each request has to retrieve more than 100 instances.

Now, I have two solutions:

  1. Serialize each instance of Object A to JSON and store it with `SET key json`.
  2. Store each field as a hash field, so every instance corresponds to one hash. As I said, each request retrieves more than 100 instances, so with hashes I have to pipeline hundreds of `HGETALL key` calls, which I worry will be very slow. Can I still get good speed using hashes?
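For illustration, a minimal sketch of the two layouts in Python with a redis-py-style client (the object and key names here are made up; `r` stands for a connected client such as `redis.Redis()`). The pipeline is the key point: it batches all the `HGETALL` calls into a single network round trip.

```python
import json

def make_object(i):
    # Hypothetical 20-field object, modeled as a flat dict.
    return {f"field{n}": f"value-{i}-{n}" for n in range(20)}

def store_as_json(r, obj_id, obj):
    # Option 1: one string key holding the whole object as JSON.
    r.set(f"obj:{obj_id}", json.dumps(obj))

def store_as_hash(r, obj_id, obj):
    # Option 2: one hash per object, one hash field per attribute.
    r.hset(f"obj:{obj_id}", mapping=obj)

def fetch_hashes(r, ids):
    # A pipeline queues all HGETALL commands client-side and sends them
    # in one round trip, so 100+ reads cost ~1 round trip, not hundreds.
    pipe = r.pipeline(transaction=False)
    for obj_id in ids:
        pipe.hgetall(f"obj:{obj_id}")
    return pipe.execute()
```

So "hundreds of HGETALLs" is not inherently slow: pipelined, they avoid the per-command round-trip latency that would otherwise dominate.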
jiluo
  • possible duplicate of [Redis strings vs Redis hashes to represent JSON: efficiency?](http://stackoverflow.com/questions/16375188/redis-strings-vs-redis-hashes-to-represent-json-efficiency) – dh1tw May 05 '14 at 12:06

2 Answers


This is going to depend on a few things:

  1. How much RAM is available to your Redis server
  2. How many objects you want to store
  3. The size of the serialized objects and their fields

Storing the serialized object will take up more space, because the serialization format carries overhead (repeated field names, quoting, type information) along with the raw data. If you're short on RAM or have to store a large number of objects, it's probably best to store this data in hashes. Since you have 1 million objects, you'll probably save quite a bit of space by using hashes.

I recently ran into a very similar issue. At first I tried storing the serialized objects in the Redis DB, but I had to store over 5 million objects and each object contained a lot of excess data that I didn't need to store in the DB. This resulted in a bloated DB size and wasted a good amount of RAM.

Your requirements will probably differ from mine quite a bit, so it's best to benchmark it yourself. Try serializing an object and see how big the result is. Compare that with the combined size of the individual field names and values. If the serialized object isn't much bigger, it might be best to just serialize. It's also good to keep in mind that deserializing isn't a negligible operation.
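As a rough first check, here is a sketch of that payload-size comparison (field names and values are made up). Note this only compares raw bytes; the actual per-key memory in Redis is best measured with the `MEMORY USAGE` command, and small hashes also benefit from Redis's compact ziplist/listpack encoding.

```python
import json

# Hypothetical 20-field object.
obj = {f"field{n}": f"value-{n}" for n in range(20)}

# Size of the whole object serialized as one JSON string.
json_bytes = len(json.dumps(obj).encode())

# Raw payload of a hash: field names plus values, with no JSON punctuation.
hash_bytes = sum(len(k.encode()) + len(v.encode()) for k, v in obj.items())

print(json_bytes, hash_bytes)
```

Here the JSON string is larger because of the quoting, braces, and separators it adds on top of the same field names and values; on your real objects the gap may be bigger or smaller, which is exactly why benchmarking your own data matters.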

Jeff

Strings and hashes are very different datatypes, and both have advantages and disadvantages. In the case you describe, you're leaving out a few very important considerations, for example:

  • How often are you going to write the keys?
  • Do you need to retrieve 100 items just to aggregate them? Would it make sense to store a hash with the aggregated values instead, so the retrieval is done only once?
  • Is this a heavy-write or heavy-read application?
  • Will you need to write atomically? In other words, can two simultaneous requests update the same value in Redis? If so, how will you guarantee that storing a JSON string won't cause race conditions?
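To make that last point concrete, here is a sketch (names are illustrative; `r` stands for a redis-py-style client). Updating one field inside a JSON string forces a client-side read-modify-write, whereas a hash field can be changed atomically on the server with `HINCRBY` (or the JSON path can be guarded with `WATCH`/`MULTI` or a Lua script):

```python
import json

def bump_json(r, key, field):
    # NOT atomic: two clients can both read the old JSON, both
    # increment, and one SET overwrites the other (a lost update).
    obj = json.loads(r.get(key))
    obj[field] += 1
    r.set(key, json.dumps(obj))

def bump_hash(r, key, field):
    # Atomic: HINCRBY runs server-side, so concurrent
    # increments on the same field never clash.
    r.hincrby(key, field, 1)
```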

Making a decision based solely on performance is not a good idea; you might be missing other important aspects such as integrity, scalability, and maintainability.

Nico Andrade