1

I have an app (nodejs / express) that needs to find a routing rule to apply based on time of day and day of week.

So for example, I have the following business rules:

  • on Mondays and Tuesdays between 09:00 GMT and 12:00 GMT, I need route object ABC to "location x".
  • on Tuesdays between 13:00 and 13:30 I need to route ABC to "location y".

( For the purposes of this discussion, it really doesn't matter what object ABC is. )

I am debating between two options, as far as how I should design my keys in my REDIS database

Option 1

Make the day information a part of the object data, like this:

HMSET routing_rules:objectABC_09:00_12:00 days 'mon tues' location X 
HMSET routing_rules:objectABC_13:00_13:30 days 'tues' location Y

Advantage of this method - When it's time to update the days list I can simply do this:

HMSET routing_rules:objectABC_09:00_12:00 days 'mon tues thu'

The downside here is that in order to find the right rule, I have to make two queries.... First do a SCAN command to find the right time range... and then if there's a match... do another query to find the days value.

Option 2

Include the day of week information as a part of the key

HMSET routing_rules:objectABC_09:00_12:00_mt location X 
HMSET routing_rules:objectABC_13:00_13:30_t location Y

I would use a naming convention like

m = monday
t = tuesday
w = wed
r = thursday 
etc.

The advantage to option 2 is that in order to find the right routing rule based on current time and day, I only need to run one SCAN command (we can assume that my SCAN command will return all results in one shot)

But the downside to option 2 is that when I need to add a new day to the key, I think i need to delete the key and value... and then recreate it. Is this correct?

And as of right now, the only way I know how to delete is by doing an HDEL for each value in the object, and then the key is deleted. So for example, I've been doing something like this:

  127.0.0.1:6379> HDEL routing_rules:objectABC_00:00_00:00 days 'mon tues' location x

where I have to list all the values in the object to delete the entire key / value pair. In this example, it's not so bad because I only have two values for this key - the location and the days fields. But if there was a lot more data, it'd be a little cumbersome. And I'm not sure if there are other considerations to factor in besides the number of fields associated with this key.

If you have any suggestions on how to design this key for optimal performance and maintenance, I'm all ears. The way I see it, there is no way to avoid running the scan at least once. But this is my first redis database attempt so I apologize in advance for remedial questions / noob mistakes.

EDIT 1

Assuming that I have enough memory and assuming that I only have to save one field / value per key, let's say I create my keys like this:

 SET routing_rules:objectABC_09:00_12:00_m  X 
 SET routing_rules:objectABC_09:00_12:00_t  X 
 SET routing_rules:objectABC_13:00_13:30_t  Y

And now a request comes in for object ABC and it's Monday at UTC 11. Because my keys represent start times and end times (aka a range ), I don't see how I can find the right key / value pair without doing a scan.

Am I missing something?

Happydevdays
  • 1,982
  • 5
  • 31
  • 57

2 Answers2

0

I would not use any SCAN command in this case (and in most cases). You probably would have to invoke it multiple times to scan the whole key space, while there are other alternatives providing a direct access to the data you are looking for -which is what makes a K/V store performant.

By example, with your first solution, put all values in a hash, and get all routes in one request with HGETALL. Then, you will have to iterate on the values in your application to select the right one.

Another solution, which do not require any iteration on application side, is to create a route per day and per hour range:

SET routing_rules:objectABC_09:00_12:00_m location X SET routing_rules:objectABC_13:00_13:30_t location Y ...

Then in one GET request you have the value you are looking for. Adding a day does just require a SET. The drawback compared to your solution is memory usage: it multiples entries. You didn't give any clue about number of entries, but if it's very high, it could be a problem. To reduce the memory required, you can start by using shorter key names, like r:objectABC_09001200m instead of routing_rules:objectABC_09:00_12:00_m).

Update

Given the fact the time ranges does not seem the be constant, and assuming there is no algorithm to deduce the time range from the current time, the first solution, based on using hash, seems to be better than he second, based on GET/SET. But I would name the fields according to the time ranges:

HSET routing_rules:objectABC 09:00_12:00 X HSET routing_rules:objectABC 12:00_12:30 Y

Then I would get all Fields for a given object using HGETALL routing_rules:objectABC and iterate over the member keys to find the right one.

Pascal Le Merrer
  • 5,883
  • 20
  • 35
  • Pascal, what would be considered a "high" value for number of entries? – Happydevdays Oct 03 '16 at 12:54
  • It depends on the memory size of each key+value, compared to the available memory on the server. – Pascal Le Merrer Oct 03 '16 at 13:02
  • I think I will have about 15 GB to use... and 50,000 objects with each one having a max of 5 routing rules. – Happydevdays Oct 03 '16 at 13:20
  • Try to define the approximative size of a key + its value, then multiply it by 50000 and compare it to 15GB to know if you can use the second solution I described. – Pascal Le Merrer Oct 03 '16 at 13:44
  • Pascal does your suggestion to use SET change in view of this article? http://stackoverflow.com/questions/13557075/redis-set-vs-hash - It actually seems that both answers got similar number of upvotes... – Happydevdays Oct 03 '16 at 17:15
  • The challenge with using SET / GET is that I when i'm looking for a rule that applies to the current date & time, since the keys all have a time range baked in, I will always have to do some sort of a generic query first to find the matching one. Please see EDIT 1 – Happydevdays Oct 03 '16 at 17:30
  • The question you linked does not change my poit of view. Your update, on the contrary does. I'll update my answer to take it in account. – Pascal Le Merrer Oct 03 '16 at 21:59
0

I would go with a sorted set solution, a set for each object, the value should be the location*, and the score should be the minute in the week this rule expires.

e.g. (week starts at Monday 00:00)

on Mondays and Tuesdays between 09:00 GMT and 12:00 GMT, I need route object ABC to "location x".

Monday 12:00 => 720 Tuesday 12:00 => 2160

ZADD ABC_rules 720 x 2160 x

There are two issues here, 1st your example shows times that there are no rules, so this must be taken into account. 2nd and more major, set objects must be unique, and x cannot be stored twice. Both together are the * reason above, the way to solve it is to apped/prepend the value with the minute in the week the rule starts:

Monday 9:00 => 540 Tuesday 9:00 => 1980

ZADD ABC_rules 720 x:540 2160 x:1980

To query, all you need is to use ZRANGEBYSCORE with the minute in the week, and make sure the time that you get appended to the location is before the time you sent.

Query for Monday 10:00 (600):

ZRANGEBYSCORE ABC_rules 600 +inf LIMIT 1

The result will be x:540 and since 540 is lower than 600 you know x is a valid answer.

Query for Monday 13:00 (780):

ZRANGEBYSCORE ABC_rules 780 +inf LIMIT 1

The result will be x:1980, and since 1980 is larger than your query (780) this result is invalid and you should take your default routing (or whatever your solution is to the unmapped times in your schedule).

To deleting rules you must remove the location appended with the start time:

ZREM ABC_rules x:540

You can also use the ZRANGEBYSCORE to get all rules that apply in a specific day, and you can write a LUA script that clears them.

Ofir Luzon
  • 10,635
  • 3
  • 41
  • 52