
Objective (using Redis):

  1. Cache data by date, where each entry is an object (for example, a dict of strings or pandas DataFrames)
  2. Retrieve the data within a date range (say 20101010 - 20151231)
  3. Combine the attributes of the data within this date range (I'd like Redis to do this for me rather than combining it in Python logic)

Example:

20101010: { "a" : dataframe_A1, "b" : dataframe_B1 },
20101011: { "a" : dataframe_A2, "b" : dataframe_B2 }

Output:

{"a" : dataframe_A1 concatenated with dataframe_A2, "b": dataframe_B1 concatenated with dataframe_B2}

One solution I've found is the ZADD/ZRANGEBYSCORE technique: I build an index on the date in YYYYMMDD format, so the sorted set is naturally ordered by date. This makes it trivial to retrieve the data in a range (for example, 20101010 - 20151231); see https://redis.io/topics/indexes.
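
To make the indexing idea concrete, here is a minimal sketch of how it might look from Python. It assumes `client` is a redis-py `redis.Redis` instance; the key names (`dates`, `data:<day>`) and the helper functions are hypothetical, and serializing each DataFrame to bytes is left to the caller.

```python
DATE_INDEX = "dates"  # hypothetical sorted-set key that holds the date index


def day_key(day):
    """Hypothetical naming scheme for the per-date hash, e.g. 'data:20101010'."""
    return f"data:{day}"


def store_day(client, day, frames):
    """Store one day's attributes and register the day in the date index.

    `frames` maps attribute name -> already-serialized bytes; how a
    DataFrame is serialized is an assumption left to the caller.
    """
    client.hset(day_key(day), mapping=frames)
    # The score is the YYYYMMDD integer, so the sorted set orders by date.
    client.zadd(DATE_INDEX, {str(day): day})


def days_in_range(client, start, end):
    """Return the indexed days with start <= day <= end, in ascending order."""
    members = client.zrangebyscore(DATE_INDEX, start, end)
    # Members come back as bytes (or str with decode_responses=True).
    return [int(m) for m in members]
```

With a running server, something like `store_day(redis.Redis(), 20101010, {"a": blob_a, "b": blob_b})` followed by `days_in_range(client, 20101010, 20151231)` would return the days whose hashes need to be fetched.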

Is there a way to efficiently retrieve the data from Redis in a combined format (#3)? See my example output above.

What sequence of operations/commands would most efficiently get me this information?

Note:

Thanks

labheshr
  • I'm considering using Redis to store a time series as well. I'd use Redis MULTI/EXEC or Lua to run the ZRANGEBYSCORE over a sorted set whose members are the unique keys of hashes, then call HGETALL on each to retrieve the object. You could also store the object/dataframe as a JSON string, but that might be too much overhead depending on the complexity of the object and how efficient Python's JSON parser is. – nak Nov 07 '17 at 05:26
  • I implemented the above solution, but combining the data from various dates in Python is still slow (example: pandas.concat(..) for multiple dataframes) – labheshr Nov 07 '17 at 12:59
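
For reference, the approach from these comments could be sketched roughly as below. This is only an illustration under assumptions: `client` is taken to be a redis-py `redis.Redis` instance, the `data:` key prefix and both helper functions are hypothetical, and `deserialize` / `concat` are supplied by the caller (for DataFrames, `concat` could be `pandas.concat`). If the slowness comes from concatenating frames pairwise inside a loop, collecting all the pieces per attribute and concatenating once may help; that is a guess about the cause, not a measured result.

```python
from collections import defaultdict


def fetch_days(client, days, key_prefix="data:"):
    """Fetch one hash per day in a single round trip.

    redis-py pipelines are transactional by default, so the queued
    HGETALL calls are sent wrapped in MULTI/EXEC.
    """
    pipe = client.pipeline()
    for day in days:
        pipe.hgetall(f"{key_prefix}{day}")
    return pipe.execute()  # one dict per day, in the order queued


def combine(per_day, deserialize, concat):
    """Group values by attribute across days, then concatenate once each.

    Field names are bytes unless the client was created with
    decode_responses=True.
    """
    grouped = defaultdict(list)
    for record in per_day:
        for field, raw in record.items():
            grouped[field].append(deserialize(raw))
    # A single concat call per attribute instead of one per pair of days.
    return {field: concat(parts) for field, parts in grouped.items()}
```

Here `days` would be the members returned by ZRANGEBYSCORE for the requested range. The combining step still runs in Python; this sketch only reduces the number of round trips and concat calls.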

0 Answers