
I have a .NET service that needs to feed live financial data to its clients. The output rate for this feed might get intense, and I am looking for the best architecture to implement this type of service with low latency and high performance.

I was thinking of using some kind of streaming data provider, like the ones used for audio or video, but sending feed updates instead.

I would appreciate any thoughts on this subject, or any real-world examples.

Update:

I don't have to use WCF; that was only my first approach, since it is the current technology. Any other implementation in C# is welcome.

asked by Sol
  • I am not sure that WCF is suitable for that. It is too thick. I recommend raw sockets + Protocol Buffers. Convenient and very fast. – Andrey Apr 22 '11 at 17:17
  • @Andrey that is ***exactly*** how I would do it; there are several protocol buffers libraries available for C# – Marc Gravell Apr 22 '11 at 17:24
  • @Marc Gravell♦ I became a big fan of Protocol Buffers. Btw I use your implementation (protobuf-net) and it is so great! Thanks for it! :) – Andrey Apr 22 '11 at 17:29
  • @Andrey thanks, can you point me to some case studies or documentation for this approach with C#? – Sol Apr 22 '11 at 18:23
  • Did you have a look at Node.js and its pub-sub implementations, such as http://howtonode.org/redis-pubsub? – Chandermani Apr 22 '11 at 18:41
  • I thought that if you are chasing performance, you should not look at .NET in the first place; the whole framework is thick. But again, it depends on how fast is fast. – Yuan Apr 26 '11 at 20:45
  • @MarcGravell, So can I assume that protobuf-net is compatible with ordinary proto-buf? (like if a C# client is talking with a C++ server). – Jiminion Jun 06 '14 at 17:32
  • @Jim yes, the binary protocol is the binary protocol. Some features may be easier or harder on different implementations, of course. – Marc Gravell Jun 06 '14 at 17:48
  • @MarcGravell, TY. Can they be driven by the same .proto file (assuming we are only doing simple stuff....)? Thanks again. – Jiminion Jun 06 '14 at 17:50
  • @Jim yes, but for that you *might* find it easier to work from protobuf-csharp-port instead. – Marc Gravell Jun 06 '14 at 18:01
  • We have achieved over 14 million messages per second using our .NET port of Aeron, which can be found here: https://github.com/AdaptiveConsulting/Aeron.NET – Rich Linnell Nov 18 '17 at 06:40

7 Answers


Full Disclosure: I work for Informatica (formerly 29West) and am on the engineering team responsible for their messaging products. I am biased. I do, however, have a pretty good grasp of low-latency messaging in the financial market.

If your message rates are about 60 messages/sec. (as stated in a comment on Will Dean's answer), and they're being delivered to a GUI with a human sitting in front of it and reacting to the market at human speed, it honestly doesn't matter a whole lot what software you use from a latency perspective. You might even be able to get away with using WCF (though I'd still recommend against it; we considered supporting it once and prototyped an adapter for it, and it bloated latencies by an order of magnitude - we decided not to bother with it at the time).

Now, Informatica's messaging software can pass messages between processes on the same machine in well under a microsecond, and if you want to buy some nice 10 gig-E NICs with kernel bypass or InfiniBand gear, you can pass millions of messages per second between machines with single-digit microseconds of latency. We'll also soon be releasing a new data serialization library that's supported in C/C++, Java, and .NET as part of the messaging product that in some cases is actually faster than Protocol Buffers (although Protocol Buffers are widely used and also a very good choice). Our .NET and Java APIs both have a feature called "ZOD" for "Zero Object Delivery", which is a kinda funny way of saying they generate no new objects during message delivery, meaning no garbage collection pauses & associated latency spikes/outliers. We've got another product called UMDS that's specifically designed to fan out high-speed backbone traffic to slower desktop apps without slowing down the backbone or other clients.

I could go on and on about how great Informatica's messaging software is and I do think it's worth checking out, but this already looks like a straight-up ad, and I'm an engineer, not a sales person. So here are a few pieces of more general advice:

  • If you have a lot of clients receiving the same data, you'll want some flavor of UDP multicast. You'll often want a reliable multicast transport of some kind - the well-known (and free) reliable multicast protocol is PGM. Windows includes an implementation of PGM that's usable in C#; I'll refer you to Mike Rettig's excellent blog post on how to use it if you want to try it out (there is also a small sketch after this list). (I happen to know Mike - he's a smart guy.) Protocol choice is an area in which you get what you pay for; Informatica's messaging includes a reliable multicast protocol loosely based on PGM (our architect who designed it co-wrote the PGM RFC a long while back), but with a lot of major improvements. Plain PGM might be fine for what you need, though.

  • You want to go with a brokerless/serverless architecture. Have the apps communicate peer-to-peer with nothing in the middle. Avoid extra hops in the message path (which usually means avoid most JMS implementations, avoid almost anything with "queue" in the name somewhere, etc.).

  • Be mindful of how your system behaves when one individual client misbehaves. Can one slow consumer slow down everyone else?

  • There are a lot of OS tuning and BIOS tuning options that can benefit any sort of low-latency messaging, homegrown or bought - things like interrupt coalescing, tying NIC interrupts to a particular CPU core, receive-side scaling (which has historically been terrible when used with UDP on Windows, but should be getting much better in the future), disabling certain CPU power states, etc.

  • Resist the temptation to use built-in object serialization in .NET to send whole objects over the wire - it is orders of magnitude slower than using a simple binary format (like Protocol Buffers, or Informatica's serialization library, or your own binary format, etc.).
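To make the PGM bullet above concrete, here is a minimal sketch of a sender using Windows' reliable multicast support from C#, roughly the approach Mike Rettig's post describes. The multicast group, the port, and the `(ProtocolType)113` cast (113 is the IANA protocol number for PGM) are my assumptions rather than anything from the original answer, and the Windows "Reliable Multicast Protocol" feature must be installed for it to run.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class PgmSenderSketch
{
    static void Main()
    {
        // Hypothetical multicast group and port - pick your own.
        var group = new IPEndPoint(IPAddress.Parse("224.1.1.1"), 40000);

        // SocketType.Rdm = "reliably delivered messages"; 113 is the IANA
        // protocol number for PGM.
        using (var sender = new Socket(AddressFamily.InterNetwork,
                                       SocketType.Rdm, (ProtocolType)113))
        {
            sender.Bind(new IPEndPoint(IPAddress.Any, 0)); // any local interface
            sender.Connect(group);                         // "connect" to the multicast group

            byte[] payload = Encoding.UTF8.GetBytes("MSFT,26.34,26.35");
            sender.Send(payload);                          // one message to all subscribed receivers
        }
    }
}
```

On the receiving side the pattern is the mirror image: create the same kind of socket, Bind it to the group endpoint, Listen, Accept a session, and call Receive on it in a loop.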

If you have more specific questions or need more detail on any of my advice, just let me know!

– strangelydim
  • Do you have any links to guidelines on how to tune Windows Server 2008 and/or an HP DL360p for HFT, other than "Configuring the HP ProLiant Server BIOS for Low-Latency Applications"? In particular, what should I disable in Windows? Should I disable the Windows firewall, for example, or is it not important? – Oleg Vazhnev Oct 13 '12 at 07:14

How low is 'low latency' and how busy is 'intense'? You need to have some idea of what you're aiming for to choose the right approach.

I could supply you some hardware which would respond to 100% of all requests within, say, 20 µs up to the full capacity of your network hardware, but it would not use WCF much at all.

To a very broad approximation, I would say that things like WCF are very high-level and trade ease-of-use and abstraction-for-the-benefit-of-the-programmer against performance (latency/throughput). Whether they trade off too much for your application is something that needs real numbers.

The lowest-latency, lowest-overhead IP-based protocol in widespread use is UDP - that's why it's used for things like DNS and NTP. It's very scalable at the server, because the server doesn't need to keep any state, and it's very simple to implement on almost any platform. But you do need to be thinking in terms of network packets rather than .NET objects. Do you get to supply the client-end software too?
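To illustrate the "think in packets" point, here is a minimal loopback sketch in C# using UdpClient; the port number and the one-price-per-datagram layout are my own assumptions for the example.

```csharp
using System;
using System.Net;
using System.Net.Sockets;
using System.Text;

class UdpFeedSketch
{
    static void Main()
    {
        const int port = 5000; // hypothetical port

        using (var client = new UdpClient(port))   // receiver bound first so nothing is lost
        using (var server = new UdpClient())
        {
            // Server side: push the latest price as one small datagram (one network packet).
            byte[] tick = Encoding.ASCII.GetBytes("EURUSD,1.4321");
            server.Send(tick, tick.Length, new IPEndPoint(IPAddress.Loopback, port));

            // Client side: block until the next datagram arrives.
            var remote = new IPEndPoint(IPAddress.Any, 0);
            byte[] datagram = client.Receive(ref remote);
            Console.WriteLine(Encoding.ASCII.GetString(datagram));
        }
    }
}
```

There is no per-client connection state on the sending side, which is what makes UDP so cheap to scale; the flip side, as discussed in the comments below, is that you have to decide whether losing a packet is acceptable for your feed.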

– Will Dean
  • Totally agree with your "broad approximation". But UDP is not ok for financial data, where it is unacceptable to lose data. TCP should be ok. – Andrey Apr 22 '11 at 17:20
  • I thought a lot of this super-low-latency financial data was just latest prices, so it was almost the canonical example of where it *was* ok to lose data, in that a new latest price would be along shortly. But if a TCP connection is held open, then it's no higher latency than a UDP one, though it's more expensive for the server. – Will Dean Apr 22 '11 at 17:38
  • @Will Dean and what if it is a feed of order requests? :) If a TCP connection is held open then latency will still be higher, because TCP ensures delivery, and this has some overhead. – Andrey Apr 22 '11 at 17:55
  • Well, yes, you're really just illustrating my substantive point, which is that you can't choose a solution without knowing about the problem. I dispute the TCP latency though, unless you're talking about the small number of extra bytes of header, and the increased complexity of the stack, both of which are probably negligible (again, we can't actually know that). You would need the right TCP stack setup though, of course, so that it didn't hold things up unnecessarily at the sending end. – Will Dean Apr 22 '11 at 18:00
  • @Will Dean From wiki: "TCP provides both data integrity and delivery guarantee (by retransmitting until the receiver **acknowledges** the reception of the packet)." It means that TCP message is considered sent if you receive acknowledge. – Andrey Apr 22 '11 at 18:07
  • @Andrey, but the point about sliding window protocols like TCP is that until the window is full, the sender doesn't have to wait for the acknowledge before it can send a packet. The latency (by which I'm meaning the time taken to get a response to a request) is not impaired vs UDP unless you run out of window, which is probably just a sign that it wasn't big enough for your delay*bandwidth product. – Will Dean Apr 22 '11 at 18:13
  • @Will Dean - I get to supply the client software. I am talking about 40-60 messages per sec., depending on the number of instruments used by the client. One instrument may be updated 4-5 times a sec, even more. What is the software solution you suggest? – Sol Apr 22 '11 at 18:21
  • I'd take Andrey's first suggestion from his comment on your question - assuming that by 'raw sockets' he means 'a straightforward TCP socket', rather than a real 'raw socket', which to me is naked IP. A fast serialisation scheme like one of the Protocol Buffers implementations seems like a great compromise between excessive abstraction and tedious byte twiddling. If you hold the TCP connection open, you shouldn't have much trouble with the performance you need. Look into the various async options with .NET sockets to avoid having one thread per client, which is rarely a good solution. – Will Dean Apr 22 '11 at 20:40
  • @Will Dean - What about the streaming architecture? Wouldn't that be a good solution? The streamer keeps an open channel and is optimized for fast sending and processing. I can send the objects using Protobuf for fast serialization. Any thoughts? – Sol Apr 22 '11 at 21:15
  • Sol - I don't know anything about whatever type of streaming you're planning to use, so I don't know if it solves any of the problems you'd encounter with a simple TCP socket. Some streaming systems build up very significant buffers in order to be able to mitigate jitter problems, so they then have terrible latency. – Will Dean Apr 22 '11 at 21:19
  • @Will you are right about latency. Last question - do you happen to know ZeroMQ? (http://www.zeromq.org/) - it seems as a good (& light) abstraction over simple sockets. – Sol Apr 22 '11 at 21:31
  • I don't know zeromq but it looks really interesting - thanks for that. – Will Dean Apr 22 '11 at 21:35

Live financial data? Never rely on WCF for that. Instead, go with what other industries use; e.g., NASDAQ uses Real-Time Innovations' Data Distribution Service (RTI DDS) to deliver live stock ticks to users. They provide C/C++/C# APIs for their communications libraries, which are extremely easy to set up and use (compared to WCF).

In general, this sort of real-time data feed uses the publish/subscribe paradigm, which helps ensure that communication happens with minimal overhead. This approach is the main idea behind message-oriented middleware, and it is exactly what financial services use for real-time data.
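Just to illustrate the pub/sub idea itself (this is not RTI-DDS's API, only a toy in-process version with names of my own choosing), the core pattern looks like this:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Minimal in-process publish/subscribe hub keyed by topic (e.g. an instrument symbol).
// Real middleware does this across the network, but the subscriber-facing model is the same.
class PubSubHub<T>
{
    private readonly ConcurrentDictionary<string, List<Action<T>>> _subs =
        new ConcurrentDictionary<string, List<Action<T>>>();

    public void Subscribe(string topic, Action<T> handler)
    {
        var handlers = _subs.GetOrAdd(topic, _ => new List<Action<T>>());
        lock (handlers) handlers.Add(handler);
    }

    public void Publish(string topic, T message)
    {
        if (!_subs.TryGetValue(topic, out var handlers)) return;
        lock (handlers)
        {
            foreach (var h in handlers) h(message); // fan out to every subscriber of this topic
        }
    }
}

class Demo
{
    static void Main()
    {
        var hub = new PubSubHub<decimal>();
        hub.Subscribe("MSFT", price => Console.WriteLine($"MSFT tick: {price}"));
        hub.Publish("MSFT", 26.34m);
    }
}
```

The publisher doesn't need to know who the subscribers are, which is what keeps the sending side cheap regardless of how many clients are listening.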

On a side note, you can deliver real-time audio/video packets using the RTI-DDS library as well; as far as I know, unmanned aerial vehicles like the MQ-9 use this same library to deliver live video and geolocation information to ground control stations.

There are also free data distribution service libraries, but I have no experience with them; you'll need to google for them.

Edit: I'm currently prototyping some HMI (human machine interface) software which uses the aforementioned RTI-DDS libraries along with two other libraries that have similar message-oriented architectures, and they have worked a treat so far for all my real-time communication needs. Here is a demo: http://epics.codeplex.com/ (It will be used in remotely controlling the equipment in our brand new nuclear research facility.)

– Teoman Soygul
  • I'm currently learning some basics of the Tibco Rendezvous messaging platform, and this real-time messaging is also provided on top of UDP with a publish/subscribe message exchange pattern (except for certified messaging, where you demand guaranteed delivery). Many worldwide banks are using the Tibco platform. – Ladislav Mrnka Apr 30 '11 at 10:57

The more assumptions you make and features you cut out, the faster you can make your system. The more robust and flexible you attempt to make things, the more your performance will suffer. I would suggest a few basic must-haves:

  1. A binary data serialization format. Don't use XML or any other human-readable method of passing your data.
  2. A data serialization format robust enough to support cross-architecture, cross-language endpoints. BER comes to mind; C# seems to have support. Protocol Buffers, suggested in the comments above, is another option (see the sketch after this list).
  3. A transport protocol that has guaranteed delivery and data integrity. If any type of financial algorithm will be using this data, even missing one tick could mean the difference between an order being triggered and missing out on a price. Even if you are going to aggregate ticks in your server, you still want control over how the information is presented to your clients. TCP works for distributed systems. However, there are much faster alternatives if your clients are on the same machine as your server. UDP won't even guarantee ordering, which can be problematic (though not insurmountable).
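As a concrete example of points 1 and 2, here is a minimal sketch using protobuf-net (the library Marc Gravell mentions in the comments on the question); the Tick message and its fields are hypothetical, not something this answer prescribes.

```csharp
using System.IO;
using ProtoBuf; // protobuf-net NuGet package

// Hypothetical tick message. The field numbers define the wire format,
// so a C++ or Java peer can read it from an equivalent .proto definition.
[ProtoContract]
public class Tick
{
    [ProtoMember(1)] public string Symbol { get; set; }
    [ProtoMember(2)] public long PriceTimes10000 { get; set; } // fixed-point price (see item 2 in the next list)
    [ProtoMember(3)] public long TimestampTicks { get; set; }
}

public static class TickCodec
{
    public static byte[] Encode(Tick tick)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize(ms, tick);   // compact binary; no XML or text formatting overhead
            return ms.ToArray();
        }
    }

    public static Tick Decode(byte[] payload)
    {
        using (var ms = new MemoryStream(payload))
        {
            return Serializer.Deserialize<Tick>(ms);
        }
    }
}
```

In a real feed you would serialize into a reused buffer rather than allocating a new MemoryStream per tick, in keeping with the pooling advice below.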

With regard to internal processing:

  1. Avoid strings and other classes that add significant overhead to simple tasks. Use basic character arrays instead. I'm not sure what options you have in C# or if you even have lightweight alternatives. If so, use them. This applies to data-structures as well.
  2. Be aware of double/float comparison errors. Use comparisons that only check for the necessary level of precision. If possible convert everything to integers internally and provide enough metadata to convert back on the other end.
  3. Use something similar to pooled allocators in C++. My lack of knowledge of C# prevents me from being more specific (a possible sketch follows this list). Again, C# probably isn't your best choice here. The bottom line is that you are going to be creating and destroying a lot of tick objects, and there is no reason to ask the OS for the memory every time.
  4. Only send out deltas, don't send information that your clients already have. This assumes you are using a transport with guaranteed delivery. If not you could end up displaying stale data for a long time.
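Picking up item 3 above, one way to get the pooled-allocator effect in C# is a simple object pool; this is a hedged sketch with a hypothetical Tick class, not a particular library's API.

```csharp
using System.Collections.Concurrent;

// Hypothetical mutable tick object that gets reused instead of reallocated.
public class Tick
{
    public string Symbol;
    public long PriceTimes10000;

    public void Reset()
    {
        Symbol = null;
        PriceTimes10000 = 0;
    }
}

// Simple thread-safe pool: rent a Tick, fill it, process it, return it.
// Reusing objects keeps steady-state allocation (and hence GC pauses) near zero.
public class TickPool
{
    private readonly ConcurrentBag<Tick> _bag = new ConcurrentBag<Tick>();

    public Tick Rent() => _bag.TryTake(out var tick) ? tick : new Tick();

    public void Return(Tick tick)
    {
        tick.Reset();   // scrub state so stale data never leaks into the next use
        _bag.Add(tick);
    }
}
```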
– hifier

You ask specifically about a "low latency user feed". What do you really need the low latency for? For a feed-only display (especially one that does not generate revenue), couldn't the users wait a second? That would not be low latency, but it may be all you need.

If you want to trade FAST, then you need to physically move across the street from the exchange (or nearby, with an optical link). Next you need to 'trade on the card': the Ethernet card is 'smart' and is fed 'trade formulas' that program the network card to make a preprogrammed trade based on the data received (without pestering your computer).

See: http://intelligenttradingtechnology.com/article/groundbreaking-results-high-performance-trading-fpga-and-x86-technologies

Learning to manipulate that environment will buy you more than reinventing the wheel.

Ultra-low latency is costly, but billions are at stake; your stakes (and pursuit of lower latency) will be throttled by $.

– Rob

This might be of interest, although it's specific to gaming: Lowest Latency small size data Internet transfer protocol? c#

Here is a tutorial on UDP connections: http://www.winsocketdotnetworkprogramming.com/clientserversocketnetworkcommunication8r.html

Another article on UDP: http://msdn.microsoft.com/en-us/magazine/cc163648.aspx

– Jonathan

In the past I've used Tibco RV or raw sockets for streaming prices/rates, where high-frequency updates are expected. In this situation, it is often the client (or in fact the user) that is the limitation (as there are only so many updates a user can process), and this is therefore an example of where you can 'lose' data. In this situation a client-side service broker can be used to throttle updates.
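One common way to do that client-side throttling is latest-value conflation: keep only the most recent update per instrument and let the slower consumer (typically the GUI) drain at its own pace. That is an assumption about what the broker would do, not something this answer spells out; a minimal sketch:

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Latest-value conflation: the feed thread overwrites, the UI drains whatever is
// current on a timer, so a slow client never backs up the incoming stream.
public class ConflatingCache
{
    private readonly ConcurrentDictionary<string, decimal> _latest =
        new ConcurrentDictionary<string, decimal>();

    // Called on the feed thread for every incoming tick.
    public void OnTick(string symbol, decimal price) => _latest[symbol] = price;

    // Called on a UI timer (e.g. a few times per second); returns a snapshot to render.
    public KeyValuePair<string, decimal>[] Snapshot() => _latest.ToArray();
}
```

Intermediate prices are deliberately dropped, which is fine for a display but not for anything that needs every tick (see the guaranteed-messaging point below).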

If the system is used for automated trading or HFT, then products like 29West's LatencyBuster have been proven to work well and offer guaranteed messaging.

– dashton