
I have been reading a lot of the posts on here and surfing the web, but maybe I am not asking the right question. I know that Redis is currently master/slave until Cluster becomes available. However, I was wondering if someone can tell me how I would configure Redis logistically to meet my needs (or if it's not the right tool).

Scenario:

We have 2 sites on opposite ends of the US. We want clients to be able to write at each site at high volume, and we also want each client to be able to perform reads at their own site. However, we want data written at the sister site to be available in < 50ms, and we have plenty of bandwidth. Is there a way to configure Redis to meet our needs? Our writes' maximum size would be on the order of 5k, usually much less. The main point is: how can I have 2 masters syncing to one another, even if that is not supported by default?

DvideBy0

3 Answers


It's about 19ms at the speed of light to cross the US. <50ms is going to be hard to achieve.

http://www.wolframalpha.com/input/?i=new+york+to+los+angeles
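As a rough back-of-the-envelope check of those numbers (the ~3,940 km New York-to-Los-Angeles distance and the ~2/3 c propagation speed in fiber are approximations, not exact figures):

```python
# Rough sketch: propagation delay across the US, nothing more.
# Assumes ~3,940 km (New York to Los Angeles) and light travelling at
# roughly 2/3 of c in optical fiber - both figures are approximations.
distance_km = 3940
c_km_per_s = 299792                    # speed of light in vacuum
fiber_km_per_s = c_km_per_s * 2 / 3    # typical speed in fiber

one_way_ms = distance_km / fiber_km_per_s * 1000
print(f"one way: {one_way_ms:.1f} ms, round trip: {2 * one_way_ms:.1f} ms")
# roughly 20 ms one way, 39 ms round trip - before any processing at all
```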

Joshua Martell

The catch with Tom's answer is that you are not running any sort of cluster; you are just writing to two servers. This is a problem if you want to ensure consistency between them. Consider what happens when your client fails a write to the remote server. Do you undo the write to the local one? What happens to the application when you can't write to the remote server? What happens when you can't read from the local one?

The second catch is the fundamental physics issue Joshua raises. For a round trip you are talking about a theoretical minimum of 38ms, which leaves a theoretical maximum of 12ms of processing time on both ends (across three systems). I'd say that expectation is a bit too much, and bandwidth has nothing to do with latency in this case. You could have a 10Gb pipe and those timings would still hold. That said, transferring 5k across the continent in 12ms is asking a lot as well. Are you sure you've got the connection capacity to transfer 5k of data in 50ms, let alone 12? I've been on private, zero-utilization circuits across the continent and seen ping times exceeding 50ms - and ping isn't transferring 5k of data.
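Before committing to the 50ms budget, it is worth measuring what the wire actually gives you. A minimal sketch using redis-py (the hostname is a placeholder for your remote instance):

```python
# Minimal sketch: time real client -> server -> client round trips to a
# remote Redis instance. The hostname below is a placeholder.
import time
import redis

r = redis.Redis(host="redis-remote.example.com", port=6379)

samples = []
for _ in range(20):
    start = time.perf_counter()
    r.ping()                                   # one full network round trip
    samples.append((time.perf_counter() - start) * 1000)

print(f"min {min(samples):.1f} ms  "
      f"avg {sum(samples) / len(samples):.1f} ms  "
      f"max {max(samples):.1f} ms")
```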

How will you keep the two unrelated servers in sync? If you truly need sub-50ms latency across the continent, the theoretical best case above means you have 12ms to run synchronization algorithms. Even one query to check the data on the other server puts you outside the 50ms window. If the data is out of sync, how will you fix it? Given the above timings, I don't see how it is possible to synchronize in under 50ms.

I would recommend revisiting the fundamental design requirements. Specifically, why this requirement? Latency requirements of 50ms round trip across the continent are usually the sign of marketing or a lack of attention to detail. I'd wager that if you analyze the requirements you'll find that this 50ms window is excessive and unnecessary. If it isn't, and data synchronization is actually important (likely), then someone will need to determine whether the significant extra effort to write synchronization code is worth it, or even possible to keep within the 50ms window. Cross-continent sub-50ms data sync is not a simple issue.

If you have no need for synchronization, why not simply run one server? You could use a slave on the other side of the continent for recovery-only purposes. Of course, that still means that in the best case you have 12ms to get the data over there and back. I would not count on 50ms covering operations + latency + a 5k/10k data transfer across the continent.
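A sketch of that layout with redis-py, assuming standard master/slave replication and placeholder hostnames:

```python
# Sketch: one master at site A, a recovery-only slave at site B.
# Hostnames are placeholders; replication here is asynchronous.
import redis

master = redis.Redis(host="redis-site-a.example.com", port=6379)
standby = redis.Redis(host="redis-site-b.example.com", port=6379)

# Point site B at the master (equivalent to "slaveof" in redis.conf).
standby.slaveof("redis-site-a.example.com", 6379)

master.set("user:42:name", "example")   # all writes go to site A
# Site B serves local reads / recovery; it may lag until replication
# catches up, so this can briefly return None after the write above.
print(standby.get("user:42:name"))
```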

The Real Bill
  • I want to say thank you for taking the time to write all of that up. I think the question that was raised was maybe done so just to ask something impossible (not sure) in order to take away from the idea of using something new. The issue is that we have multiple locations and want users to write to the master closest to them, then have a process to ensure that there is real-time (as quickly as possible) consistency between both sites. Do you have a suggestion that may fit this scenario? – DvideBy0 Oct 10 '11 at 18:06
  • I would ask which is the more important aspect: near-side writes or near-side reads. The fastest possible setup will be a single master with local read slaves. Are you read-intensive or write-intensive, or about 50-50? If you have the data center sites already, I would put a single master up and run tests. Multi-master, especially cross-continent, is far from simple. Fortunately, in most cases it isn't actually needed. I would set up a test environment and explore it to validate the case. Don't focus on multi-master if there is a simpler, more robust way. I suspect one master and distributed slaves will work for you. – The Real Bill Oct 10 '11 at 19:59

This is probably best handled as part of your client - just have the client write to both nodes. Writes generally don't need to be synchronous, so sending the extra command shouldn't affect the performance you get from having a local node.
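A minimal sketch of that approach with redis-py (hostnames are placeholders; the remote write is pushed to a background thread so it doesn't slow the local path, and nothing here reconciles a failed remote write):

```python
# Sketch of the dual-write client: write locally, then fire the same
# command at the remote node without waiting on it.
import threading
import redis

local = redis.Redis(host="redis-local.example.com", port=6379)
remote = redis.Redis(host="redis-remote.example.com", port=6379)

def dual_set(key, value):
    local.set(key, value)                       # fast, same-site write
    threading.Thread(                           # don't block on the WAN hop
        target=remote.set, args=(key, value), daemon=True
    ).start()

dual_set("user:42:name", "example")
print(local.get("user:42:name"))
```

Note that this gives no consistency guarantee if the remote write fails, which is the catch The Real Bill describes above.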

Tom Clarkson