8

I am building a file synchronization program (not unlike Dropbox) using node.js on both ends. I need to have potentially thousands of clients requesting data at the same time.

Here is my current system:

  • Server pushes notifications to client over a websocket (file has been updated)
  • Client queues downloads and makes an HTTP request when idle

I will be serving data in compressed chunks of, say, 50 MB each, so the HTTP request overhead (headers) is negligible.

If I were to use websockets for requests and push notifications, would there be:

  • Noticeable overall speed improvements? (reduced latency, authentication, etc.)
  • Additional overhead on the server to keep connections open?
  • Issues with pushing binary data?

I think I need to have notifications sent over a dedicated websocket because I don't want them to be queued on the server while a download is taking place (lots of overhead).

Note: These websockets will be open long-term, as long as the client's system is on.

EDIT: I will be using the websockets on a different http server on different ports in order to move them to different CPU cores. I could potentially have thousands (if not hundreds of thousands) of concurrent websockets open...

beatgammit
  • "using node.js on both ends" - so both, client and server, would have node.js installed? Client won't be a browser for example? – yojimbo87 Apr 04 '11 at 09:21
  • Yeah, no browser. I'll have complete control over requests and responses. It'll be a desktop app that runs in the background. simple HTTP server, simple HTTP client. – beatgammit Apr 04 '11 at 14:49

2 Answers

5

If you intend to use node.js for both client and server, you should use the native net module with plain TCP sockets rather than WebSockets. Plain sockets are better suited to data transfer, especially binary, because they skip the WebSocket handshake and framing layer. As far as I know, browser WebSockets do not even support binary transfer yet.
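One thing to keep in mind with plain sockets: TCP is a byte stream, so you have to mark message boundaries yourself. Below is a minimal sketch of length-prefixed framing on top of the net module. The frame layout (1-byte type, 4-byte big-endian length), the `TYPE_NOTIFY` / `TYPE_CHUNK` constants, and the `encodeFrame`, `FrameDecoder`, and `createSyncServer` names are assumptions made up for illustration, not anything prescribed by node.js.

```javascript
const net = require('net');

// Hypothetical message types for this sketch.
const TYPE_NOTIFY = 1; // small "file has been updated" notification
const TYPE_CHUNK = 2;  // binary file data

// Assumed frame layout: 1-byte type, 4-byte big-endian payload length, payload.
const HEADER_SIZE = 5;

// Build one frame from a type and a Buffer payload.
function encodeFrame(type, payload) {
  const header = Buffer.alloc(HEADER_SIZE);
  header.writeUInt8(type, 0);
  header.writeUInt32BE(payload.length, 1);
  return Buffer.concat([header, payload]);
}

// 'data' events do not line up with messages, so the decoder buffers
// incoming bytes and calls onFrame only once a complete frame has arrived.
class FrameDecoder {
  constructor(onFrame) {
    this.onFrame = onFrame;
    this.buffered = Buffer.alloc(0);
  }

  push(data) {
    this.buffered = Buffer.concat([this.buffered, data]);
    while (this.buffered.length >= HEADER_SIZE) {
      const type = this.buffered.readUInt8(0);
      const length = this.buffered.readUInt32BE(1);
      if (this.buffered.length < HEADER_SIZE + length) break; // wait for more bytes
      const payload = this.buffered.subarray(HEADER_SIZE, HEADER_SIZE + length);
      this.buffered = this.buffered.subarray(HEADER_SIZE + length);
      this.onFrame(type, payload);
    }
  }
}

// Wire a decoder to every incoming connection. The caller decides
// when to call listen() on the returned server.
function createSyncServer(onFrame) {
  return net.createServer((socket) => {
    const decoder = new FrameDecoder((type, payload) => onFrame(socket, type, payload));
    socket.on('data', (data) => decoder.push(data));
  });
}
```

A sender would then call something like `socket.write(encodeFrame(TYPE_NOTIFY, Buffer.from('a.txt')))`. Note that this decoder holds a whole frame in memory before delivering it, so for 50 MB chunks you would probably want to split the data into smaller frames or stream the payload to disk instead.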

yojimbo87
  • 1
    I'll have to look into that. Are there any statistics or benchmarks comparing two equivalent servers under high load? – beatgammit Apr 05 '11 at 16:49
  • I don't know of any benchmark comparing pure sockets with WebSockets, but there was a similar [question](http://stackoverflow.com/questions/5509905/websockets-between-2-servers) on this topic recently. – yojimbo87 Apr 05 '11 at 18:41
  • Yeah, there are some pretty nice libraries, such as [socket.io-node](https://github.com/LearnBoost/Socket.IO-node). If nobody posts any benchmarks or something more complete than your answer, I'll probably mark yours as correct. – beatgammit Apr 05 '11 at 18:46
  • 1
    You are definitely right about websockets not supporting binary transfer. I think a regular TCP socket is what I'm looking for. Thanks so much for the help!! – beatgammit Apr 07 '11 at 17:14
  • 4
    For clarity for future viewers: websocket _does_ support binary transfer, but javascript has no binary type, which makes it impossible to transfer binary data to browsers over websocket. One can, however, use websocket with a different client to transfer binary data. See: http://dev.w3.org/html5/websockets/ – Umur Kontacı Sep 02 '12 at 16:43
3

I was searching around for something else and I found this post that explains websockets pretty well:

http://blog.new-bamboo.co.uk/2009/12/7/real-time-online-activity-monitor-example-with-node-js-and-websocket

Here are some pretty interesting parts from the article:

Websocket enables you to have continuous communication in significantly less network overhead compared to existing solution.

And:

During making connection with WebSocket, client and server exchange data per frame which is 2 bytes each, compared to 8 kilo bytes of http header when you do continuous polling.

For my use case (no browser), this seems like the optimal solution! Now I just need to decide whether I want to have a single websocket or multiple websockets per client (I'm leaning towards a single one at this point).
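To put rough numbers on the header overhead, here is a small sketch. It assumes the frame layout that was later standardized in RFC 6455 (the article quoted above predates that spec), takes the article's 8 KB HTTP header figure at face value, and reads "50 MB" as 50 × 1024 × 1024 bytes. The `wsHeaderSize` and `overheadPercent` helpers are made up for illustration.

```javascript
// Header size in bytes of a single WebSocket frame, assuming RFC 6455 framing.
function wsHeaderSize(payloadLength, masked) {
  let size = 2;                             // FIN/opcode byte + mask bit and 7-bit length byte
  if (payloadLength > 65535) size += 8;     // 64-bit extended payload length
  else if (payloadLength > 125) size += 2;  // 16-bit extended payload length
  if (masked) size += 4;                    // masking key, required for client-to-server frames
  return size;
}

// Header bytes as a percentage of payload bytes.
function overheadPercent(headerBytes, payloadBytes) {
  return (headerBytes / payloadBytes) * 100;
}

const CHUNK_BYTES = 50 * 1024 * 1024; // assumed meaning of "50 MB"
const HTTP_HEADER_BYTES = 8 * 1024;   // the article's figure, not a measurement

console.log('WebSocket header bytes:', wsHeaderSize(CHUNK_BYTES, true));
console.log('HTTP overhead:', overheadPercent(HTTP_HEADER_BYTES, CHUNK_BYTES).toFixed(4) + '%');
console.log('WebSocket overhead:',
  overheadPercent(wsHeaderSize(CHUNK_BYTES, true), CHUNK_BYTES).toFixed(6) + '%');
```

Under these assumptions both overheads are a tiny fraction of a 50 MB chunk, so the header saving matters mostly for small, frequent messages such as the notifications.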

I really hope this is useful to someone. I'll leave this open for the time being. I'll be attending a JS conference later this week, and if I learn anything more I'll add to this post.

For those of you who care about browser support, see this. It looks like WebKit is the only reliable engine that supports it (Chrome and Safari).

beatgammit
  • Hi tjameson, did things go well with the websocket implementation of your file transfer project? I'm about to start implementing a similar project in Python on both the server and client sides. Practically speaking, is it good practice to do it all over websockets, or should I use them only for messaging and do the file transfer part over HTTP instead? Let me know your ideas from your experience. Thanks. – securecurve Dec 27 '12 at 16:08
  • Since you're using Python, I can only assume you want something easy and fast enough, which HTTP is. It offers a great structure and is well supported. For an Arduino project, where the data was pretty small, I implemented the WebSocket protocol without the HTTP handshake. It's pretty simple (one byte header, 1-8 bytes for message length) and HTTP is hard on Arduino. If you need raw speed, look into that, otherwise just stick with HTTP until you run into problems. – beatgammit Dec 27 '12 at 23:56