
I have a two-node system where I'm trying to replicate in-memory state between the nodes. To simplify, just think Master-Slave (Active-Passive). Node A has a relatively constant flow of changes coming in and tries to push that state to Node B. Currently, this happens on a periodic schedule: the state is batched and pushed with an instance of a TcpClient.

The current TcpClient process seems somewhat inefficient. Is there a way to establish a link between the two systems and stream state from one to the other for as long as the channel stays open?

For performance reasons I can't use anything like WCF or Remoting... I'm relatively inexperienced with lower level networking constructs, but I'm perfectly willing to try anything new. Ideally, the solution would be something I can accomplish with native C# 4 and not need any new products.

JoeGeeky
  • From your description it is unclear what the problem is. If TCP is not efficient, how did you measure it, and what are your requirements? – Zepplock Dec 20 '11 at 21:00
  • If you've coded it correctly, TCP should be able to push data as fast as your line can handle. One issue you may have with TCP is aggressive back-off on packet loss, but that shouldn't be an issue on an intranet. – Bengie Dec 20 '11 at 21:05
  • @Zepplock Sorry... it's my inexperience showing. I did not mean to say that "TCP" was inefficient. I meant to say that the TcpClient process seemed inefficient. For example, on the sender side, I keep new'ing up a new instance of the client for each transmission. I assumed there was a way to establish one link and then keep using it for a longer period of time. Is that more clear? – JoeGeeky Dec 20 '11 at 21:05
  • @JoeGeeky Just stop doing that. TCP normally establishes a link when you tell it and keeps it up until you close it. – David Schwartz Dec 20 '11 at 21:07
  • @Bengie Should I consider other protocols? At this point, I expect the demands on the system to increase, so more capacity is always welcome. If yes, what protocols should I look into? – JoeGeeky Dec 20 '11 at 21:07
  • I don't know everything about your program, but why batch state changes instead of just streaming committed changes asynchronously? That would help avoid the bursty traffic caused by instantaneous batch transfers. – Bengie Dec 20 '11 at 21:10
  • @DavidSchwartz So once I have the stream and start calling `stream.Write`, don't let go of it? Just use it until it dies? – JoeGeeky Dec 20 '11 at 21:12
  • I guess the problem (if any) is related to your code, not to TCP. – L.B Dec 20 '11 at 21:14
  • @JoeGeeky: I would check your link saturation during syncs. I suspect the amount of data may be too great for your link. Depending on your application, you could have a lot of state, and it may simply be too much for whatever NIC you're using. Things I'm curious about: #1) NIC utilization during state transfer, #2) total data transferred, #3) link speed. We need to find where the bottleneck is. TCP shouldn't be a problem. It's about the best you'll get, and it is highly hardware accelerated on server/workstation hardware. – Bengie Dec 20 '11 at 21:14
  • @Bengie The process is asynchronous, but my limited understanding of how the TcpClient worked led me to follow a pattern of... queue state, create a new TcpClient instance, send, and repeat. You are right about the bursty traffic; seeing that is what led to this question. – JoeGeeky Dec 20 '11 at 21:16
  • @L.B Thanks... I assumed so, hence the question. – JoeGeeky Dec 20 '11 at 21:17
  • @Bengie All good questions although I don't have the answers just yet. I suspect there is plenty of NIC & Wire capacity, and my handling of this pattern is all wrong. With that said, I'll try & profile the service and see what it uncovers. – JoeGeeky Dec 20 '11 at 21:20
  • @JoeGeeky: I incorrectly used the word "asynchronously". What I meant is to buffer/queue state changes and push them immediately, which seems to be what you're saying in your last comment. Typically, "Batching" involves waiting for a certain amount of data or a time to elapse. While you still want your changes to be atomic and need to batch/group certain states, you don't want to batch the groups of states. I would use a Producer Consumer pattern with a queue that pushes atomic groups of state immediately. Leave the connection open if you can. Last Q. What's the time lapse between pushes? – Bengie Dec 20 '11 at 21:22
  • @Bengie You are absolutely correct; I was waiting a configured amount of time which felt totally wrong. What I wanted to do was very similar to what you stated. In the end, it wasn't clear to me whether I could have the channel that was always open... or at least open longer than one transmission cycle. – JoeGeeky Dec 20 '11 at 21:27
  • @JoeGeeky: You'll probably want to use a Queue and have a dedicated thread polling the queue. That thread both manages the TCP connection and pushes the data. As soon as data hits the queue, that thread will immediately push it to the consumer/slave. Queues are lock free when you have one reader and one writer, but if you plan to have multiple threads writing to the queue, you will need to wrap it in a lock, though only for writing. You should have only one thread reading, and that's the one pushing the data over TCP. – Bengie Dec 20 '11 at 21:36
  • @JoeGeeky, So you know the problem is in your code and expect a solution without showing any code. -1 – L.B Dec 20 '11 at 21:47
  • @L.B Ouch, point taken... My Bad! – JoeGeeky Dec 20 '11 at 21:49
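
The queue-plus-dedicated-thread pattern suggested in the comments could look roughly like the sketch below. It is only an illustration under some assumptions: `StateSender` is a hypothetical name, each state change is assumed to be already serialized to a `byte[]`, and the output is any `Stream` (for example the `NetworkStream` of a long-lived `TcpClient`). It uses `BlockingCollection<T>` from .NET 4 instead of a hand-locked `Queue<T>`, because `Queue<T>` instance members are not guaranteed to be thread safe; the blocking collection also lets the consumer thread wait rather than poll. Error handling and reconnecting are left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// Hypothetical helper: queues state changes and drains them on one dedicated thread.
public sealed class StateSender : IDisposable
{
    private readonly BlockingCollection<byte[]> _queue = new BlockingCollection<byte[]>();
    private readonly Stream _output;
    private readonly Thread _worker;

    public StateSender(Stream output)
    {
        _output = output;
        _worker = new Thread(Drain);
        _worker.IsBackground = true;
        _worker.Start();
    }

    // Safe to call from any number of producer threads.
    public void Enqueue(byte[] change)
    {
        _queue.Add(change);
    }

    // Runs on the single consumer thread; blocks until an item is available.
    private void Drain()
    {
        foreach (byte[] change in _queue.GetConsumingEnumerable())
        {
            _output.Write(change, 0, change.Length);
        }
        _output.Flush();
    }

    // Stops accepting new items and waits until the queued ones are written.
    public void Dispose()
    {
        _queue.CompleteAdding();
        _worker.Join();
        _queue.Dispose();
    }
}
```

Only the worker thread ever touches the stream, so the producers never block on network I/O; they just hand over a buffer and move on.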

2 Answers


TCP is a lowish-level, connection-based streaming protocol. It seems like the right choice.

About the only problem you're likely to run into with TCP is latency.

You could use UDP, but from the sounds of it, you'd end up emulating TCP over UDP.

Keith Nicholas

You're right to use TCP. You should make the connection and hold it open until you're all done. You can't really beat it for speed.
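
One practical detail once the connection stays open: TCP is a byte stream with no message boundaries, so the receiver needs some way to tell where one batch of state ends and the next begins. A common approach is a length prefix. The following is a minimal sketch, not a definitive implementation; the `Framing` class, its method names, and the 4-byte big-endian header are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Net.Sockets;

// Hypothetical helper for length-prefixed messages over one long-lived stream.
public static class Framing
{
    // Writes one message as a 4-byte big-endian length followed by the payload.
    public static void WriteMessage(Stream stream, byte[] payload)
    {
        byte[] header = BitConverter.GetBytes(IPAddress.HostToNetworkOrder(payload.Length));
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Reads one message; returns null if the stream ended before a full message arrived.
    public static byte[] ReadMessage(Stream stream)
    {
        byte[] header = ReadExactly(stream, 4);
        if (header == null) return null;
        int length = IPAddress.NetworkToHostOrder(BitConverter.ToInt32(header, 0));
        return ReadExactly(stream, length);
    }

    // Read may return fewer bytes than requested, so loop until 'count' bytes arrive.
    private static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0) return null; // peer closed the connection
            offset += read;
        }
        return buffer;
    }

    // Connects once and sends every change over the same connection.
    public static void SendAll(string host, int port, IEnumerable<byte[]> changes)
    {
        using (TcpClient client = new TcpClient(host, port))
        {
            NetworkStream stream = client.GetStream();
            foreach (byte[] change in changes)
            {
                WriteMessage(stream, change);
            }
        }
    }
}
```

On the receiving node, the loop would call `Framing.ReadMessage` on the accepted client's stream until it returns null, at which point the sender has gone away and the receiver can wait for a new connection.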

If the changes are not too large, it is best to send only delta changes, not whole copies of the data. The way I've done this in the past is to use Differential Execution. That may be more than you want to attempt, but the basic idea is that you write a function to walk over the data. Think of it as a simultaneous serializer/deserializer. Under that control structure, it automatically detects all changes since the previous walk. You can grab those changes and send them to a receiver on the other end that works the same way, but folds in the changes.
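
To make the delta idea concrete, here is a much simpler stand-in. It is not Differential Execution, just a plain comparison of the current state against the last snapshot that was sent. It assumes the state can be modelled as a `Dictionary<string, int>`; `StateDelta` and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: snapshot-based delta, assuming state is a string-to-int map.
public static class StateDelta
{
    // Entries that are new or have a different value in 'current'.
    public static Dictionary<string, int> Changed(
        IDictionary<string, int> previous, IDictionary<string, int> current)
    {
        Dictionary<string, int> changed = new Dictionary<string, int>();
        foreach (KeyValuePair<string, int> pair in current)
        {
            int old;
            if (!previous.TryGetValue(pair.Key, out old) || old != pair.Value)
            {
                changed[pair.Key] = pair.Value;
            }
        }
        return changed;
    }

    // Keys that existed in 'previous' but are gone from 'current'.
    public static List<string> Removed(
        IDictionary<string, int> previous, IDictionary<string, int> current)
    {
        List<string> removed = new List<string>();
        foreach (string key in previous.Keys)
        {
            if (!current.ContainsKey(key)) removed.Add(key);
        }
        return removed;
    }

    // Receiver side: folds a delta into the replica's copy of the state.
    public static void Apply(
        IDictionary<string, int> replica,
        IDictionary<string, int> changed,
        IEnumerable<string> removed)
    {
        foreach (KeyValuePair<string, int> pair in changed)
        {
            replica[pair.Key] = pair.Value;
        }
        foreach (string key in removed)
        {
            replica.Remove(key);
        }
    }
}
```

The sender would keep a copy of the last state it pushed, send only the output of `Changed` and `Removed`, and then replace its snapshot with the current state. The cost is one extra copy of the state on the sender, which is the trade-off for not shipping the whole thing every time.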

That may be a bit much for what you're doing, but regardless, TCP is the way to go. BTW, try to avoid new-ing. That's costly.

Mike Dunlavey
  • Excellent! I have to read more on *Differential Execution*, but this sounds like a go'er. I'm trying to do some testing now. Thanks. – JoeGeeky Dec 20 '11 at 21:52