
A common way to keep clients in sync with a server in real time is to open a WebSocket/SSE connection and push all updates over it. This is obviously very efficient, but it also requires us to set up a server to handle all those persistent connections and to integrate it with the rest of our infrastructure.

While I was looking into video streaming solutions, I learned that the current way to go there is to put your data in the form of static files, allow clients to request whatever they need whenever they need it, and let highly optimized servers like nginx do the rest for you.

So I started thinking whether this could also be the way to go for message communication: just put all the data you want your clients to have fresh and synced into static files, and set up nginx to serve them. Taking advantage of things like HTTP/2, memcached, Last-Modified headers and request limiting would reduce the overhead from clients polling the same files over and over again to an absolute minimum. And not only could we get away without having to maintain an additional communication protocol, we could avoid invoking our backend code at all.
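To make this concrete, here's a minimal sketch of the kind of polling client I have in mind. The URL, and the idea that the backend simply overwrites a static JSON file, are assumptions for illustration only:

```python
import urllib.request
import urllib.error

# Hypothetical endpoint: an nginx instance serving a static JSON file
# that the backend overwrites whenever there is fresh data.
FEED_URL = "https://example.com/feed/latest.json"

def apply_poll_result(status, body, last_modified, cached_body, cached_lm):
    """Keep the cached copy on 304 Not Modified, replace it on 200."""
    if status == 304:
        return cached_body, cached_lm
    return body, last_modified

def poll_once(cached_body=None, cached_lm=None):
    """One conditional GET against the static file.

    With If-Modified-Since set, nginx answers 304 with an empty body
    when the file is unchanged, so repeated polls cost headers only.
    """
    req = urllib.request.Request(FEED_URL)
    if cached_lm:
        req.add_header("If-Modified-Since", cached_lm)
    try:
        with urllib.request.urlopen(req) as resp:
            return apply_poll_result(resp.status, resp.read(),
                                     resp.headers.get("Last-Modified"),
                                     cached_body, cached_lm)
    except urllib.error.HTTPError as e:
        if e.code != 304:  # urllib raises on 304; anything else is a real error
            raise
        return cached_body, cached_lm
```

A client would call `poll_once` on a timer, passing back the previous result each time.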

Am I missing something here?

  • This idea needs some fleshing out to show how you think it would actually work. Some measurements comparing latency and overall bandwidth with the alternatives. Really it sounds like this should be a blog post, not a StackOverflow question. – Darren Cook Mar 17 '18 at 10:03
  • IMHO this would be a step backwards rather than forwards. You can have a look at the many discussions on SO, such as [this discussion](https://stackoverflow.com/questions/10377384/why-use-ajax-when-websockets-is-available/47945952?noredirect=1#comment84388444_47945952), [this one](https://stackoverflow.com/questions/44731313/at-what-point-are-web-sockets-less-efficient-than-polling/44743650#44743650), [this one](https://stackoverflow.com/questions/14703627/websockets-protocol-vs-http/14710349) and [this one](https://stackoverflow.com/a/32257946/4025095). – Myst Mar 17 '18 at 12:47

2 Answers


IMHO this would be a step backwards rather than forwards. You can have a look at the many discussions on SO, such as this discussion, this one and this one.

In this SO thread there's a good discussion of this question, where you will find some of the additional costs related to your approach.

In short, polling (even with optimization techniques such as your suggested "static file service", HTTP/2, memcached, etc.) will always consume more resources than push techniques such as WebSockets.

For example, header parsing, cache validation, authentication (where required), and so on are repeated for each poll request, and can all be easily avoided by pushing the data.
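As a rough illustration of that repeated cost (the byte counts here are assumed ballpark figures, not measurements): even when every poll returns an empty 304, the request and response headers are resent each time, whereas a WebSocket pays its handshake once and then only a few bytes of framing per message.

```python
def polling_overhead(polls_per_sec, duration_s, header_bytes=700):
    # Request + response headers are resent on every poll,
    # even when the answer is an empty 304 Not Modified.
    return polls_per_sec * duration_s * header_bytes

def websocket_overhead(msgs_per_sec, duration_s,
                       handshake_bytes=1500, frame_bytes=6):
    # One HTTP upgrade handshake, then per-message framing only.
    return handshake_bytes + msgs_per_sec * duration_s * frame_bytes

# One poll (or pushed message) per second, for one hour:
print(polling_overhead(1, 3600))    # 2520000 bytes of headers
print(websocket_overhead(1, 3600))  # 23100 bytes
```

The exact numbers depend on your headers and cookies, but the two orders of magnitude between the approaches don't.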

Myst
  • This is effectively a link-only answer that didn't add anything from your comment above. – Brad Mar 28 '18 at 00:56
  • This does not provide an answer to the question. To critique or request clarification from an author, leave a comment below their post. - [From Review](/review/low-quality-posts/19252984) – Gilles Gouaillardet Mar 28 '18 at 01:08
  • @Brad - IMHO, answers that point to SO answers are valid and help Google post future inquiries to the better threads (vs. external link answers which can't be safely maintained). They also discourage duplicate threads/questions. After posting my comment I realized it actually answered the question and posted the same content as an answer instead. – Myst Mar 28 '18 at 01:09
  • @GillesGouaillardet, I updated the answer, but I also wonder - the question is clearly a repeated question with a number of existing threads - wouldn't it be better to link to the better threads than to create a new one? – Myst Mar 28 '18 at 01:17
  • If this is a duplicate, then mark it as such; otherwise a comment is fine. – Gilles Gouaillardet Mar 28 '18 at 01:22
  • @GillesGouaillardet , if it was clear cut, I would have flagged the question. I opened a [discussion on Meta](https://meta.stackoverflow.com/questions/365210/different-questions-with-the-same-answer-best-practice) in hopes to learn more. – Myst Mar 28 '18 at 01:32
  • That sounds like the right thing to do. FWIW, I did not downvote but flagged for removal. – Gilles Gouaillardet Mar 28 '18 at 01:40

> While I was looking into video streaming solutions, I learned that the current way to go there is to put your data in the form of static files

There's actually quite a bit of overhead with this method. It isn't ideal. The only reason people do this is to re-use existing HTTP file/blob-based CDNs for video streaming.

The latency is high, as segments have to be written out and uploaded. Even if you stream the segments coming in to clients, you have the overhead of having the manifest. Even if you do away with the manifest, you have the overhead of having a client requesting segments. Even if you use push with HTTP/2, all of this complexity still exists.

Simply put, DASH and HLS are hacks that are designed to solve a specific need. The only reason they're viable at all is that the payloads are relatively large.

> So I started thinking whether this could also be the way to go for message communication. Just put all the data you want your clients to have fresh and synced into static files, and set up nginx to serve them.

I'm assuming your messages are much smaller than video data. It's probably not worth the overhead.

> Taking advantage of things like HTTP/2, memcached, Last-Modified headers and request limiting would reduce the overhead from clients polling the same files over and over again to an absolute minimum.

There's still significant overhead. Ideally, you would use HTTP/2 and push the resource, but this again requires a specialized server.

> And not only could we get away without having to maintain an additional communication protocol, we could avoid invoking our backend code at all.

Correct.

At the end of this, you need to consider the trade-offs you're making. Some things to consider:

  • How often are you going to update the data?
  • How often are your clients going to poll for updated data?
  • How big is your data?

If you're polling infrequently, or for larger data updates, the overhead is probably fine.
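Those trade-offs can be sketched numerically. This is a rough estimator, not a benchmark; the header size and the "empty 304 for unchanged polls" behavior are illustrative assumptions:

```python
def estimate(update_interval_s, poll_interval_s,
             payload_bytes, header_bytes=700):
    """Rough cost/latency estimate for conditional polling.

    Assumes unchanged polls get an empty 304 (headers only) and the
    one changed poll gets the full payload plus headers.
    """
    avg_latency_s = poll_interval_s / 2  # expected wait for fresh data
    polls_per_update = max(1, update_interval_s // poll_interval_s)
    bytes_per_update = polls_per_update * header_bytes + payload_bytes
    return avg_latency_s, bytes_per_update

# Updates every 60 s, a 2 KB payload, clients polling every 5 s:
latency, cost = estimate(60, 5, 2048)
print(latency, cost)  # 2.5 s average latency, 10448 bytes per update
```

With infrequent updates like this, the polling overhead is a few kilobytes per update and may well be acceptable; shrink the update interval or the payload and the header cost starts to dominate.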

Brad