
When you have a single webserver with multiple nodejs applications, the typical thing you do is slap some reverse proxy in front of them all and write rules. Nginx is currently a favorite because it's fast, so I'll focus on that, but the question can also be applied to other webservers.
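For concreteness, such a setup typically looks something like this (the app paths and ports here are made up for illustration):

```nginx
# nginx routes incoming requests to the right nodejs app by URL prefix.
server {
    listen 80;
    server_name example.com;

    location /app1/ {
        proxy_pass http://127.0.0.1:3000/;  # first nodejs app
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }

    location /app2/ {
        proxy_pass http://127.0.0.1:3001/;  # second nodejs app
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```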

It struck me today that this sounds a bit inefficient. After all, the same HTTP request needs to be parsed twice - first by the proxy, then by nodejs. And the protocol itself is text-based, with plenty of edge cases for a parser to handle... couldn't the coupling between the webserver and nodejs be made more efficient, so that the request only needs to be parsed once?
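To illustrate the duplication: by the time the request callback runs in even the most basic Node server, Node's own HTTP parser has already re-done exactly the work nginx just finished (this is just the stock http module, nothing proxy-specific):

```js
const http = require('http');

// Node's built-in parser has already split the request line and headers
// by the time this callback fires - the same parsing nginx already did.
http.createServer((req, res) => {
  console.log(req.method, req.url, req.headers);
  res.end('ok');
}).listen(3000);
```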

My first idea was FastCGI, but it turns out that would just make things worse, because it limits nodejs to processing one request at a time per connection (the multiplexing part of the spec is rarely implemented). And anyway, it's pretty dated.

Then I dug a bit more and found SCGI, which seems even better and is even supported by nginx... but not on the Node.js side, it seems.
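For reference, SCGI is about as simple as such a protocol gets: the headers arrive as a single netstring of NUL-separated name/value pairs (CONTENT_LENGTH must come first and SCGI=1 must be present), followed by the raw body. A POST with a 27-byte body would look roughly like this on the wire (the length prefix is computed for this particular example; a real proxy would send many more headers):

```text
62:CONTENT_LENGTH\x0027\x00SCGI\x001\x00REQUEST_METHOD\x00POST\x00REQUEST_URI\x00/app\x00,
...27 bytes of request body...
```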

And lastly I found the Apache JServ Protocol, but its support is even worse, with neither nginx nor nodejs supporting it as far as I can tell.

Why is this? I understand that this overhead is probably small compared to everything else that happens in a typical request, but is it really so insignificant that it's not worth any effort at all? Even small gains add up, and it seems like a simple package that swaps Node's http.Server for a compatible scgi.Server could be written with minimal effort.
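To show what I mean, here's a bare-bones sketch of such a hypothetical scgi.Server (the name and API shape are my invention; error handling, connection reuse and the full http.Server interface are all omitted):

```js
const net = require('net');

// Minimal SCGI server sketch: parse the netstring header block,
// then hand the pre-parsed headers and raw body to a handler.
function createScgiServer(handler) {
  return net.createServer((socket) => {
    let buf = Buffer.alloc(0);
    socket.on('data', (chunk) => {
      buf = Buffer.concat([buf, chunk]);

      // Netstring framing: "<length>:<header block>,"
      const colon = buf.indexOf(0x3a); // ':'
      if (colon === -1) return;
      const headerLen = parseInt(buf.slice(0, colon).toString('ascii'), 10);
      const headerEnd = colon + 1 + headerLen;
      if (buf.length < headerEnd + 1) return; // header block incomplete

      // The header block is NUL-separated name/value pairs.
      const parts = buf.slice(colon + 1, headerEnd).toString('ascii').split('\0');
      const headers = {};
      for (let i = 0; i + 1 < parts.length; i += 2) headers[parts[i]] = parts[i + 1];

      const bodyLen = parseInt(headers.CONTENT_LENGTH || '0', 10);
      const bodyStart = headerEnd + 1; // skip the trailing ','
      if (buf.length < bodyStart + bodyLen) return; // body incomplete

      socket.removeAllListeners('data');
      handler({ headers, body: buf.slice(bodyStart, bodyStart + bodyLen) }, socket);
    });
  });
}

// The app replies with CGI-style response headers, which nginx's
// scgi module translates back into a proper HTTP response.
createScgiServer((req, socket) => {
  socket.end('Status: 200 OK\r\nContent-Type: text/plain\r\n\r\nhello\r\n');
}).listen(4000);
```

The real work, of course, would be in faithfully mimicking http.IncomingMessage and http.ServerResponse so that existing frameworks could run on top of it unchanged.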

Vilx-
  • My guess is that the gains are so small that it isn't worth the complexity of having to invent a public but pre-parsed http mechanism and then build it into both nginx and node.js. – jfriend00 May 10 '20 at 01:13
  • Oh, and it has to keep up with all the changes/improvements/features and new versions of http happening. – jfriend00 May 10 '20 at 02:25
  • @jfriend00 Oh, yeah, I forgot about http/2 and http/3. Actually - isn't this even more of a reason for abstracting this away, so that only the webserver needs to worry about it and the application can focus on other things? – Vilx- May 10 '20 at 10:12
  • Well, node.js is never going to require a separate web server - it will always have the capability of being a web server too as that is perfectly fine for many deployments. So, it's not like node.js is going to stop having the capability of being a web server and thus having to keep up with all the http standards itself. – jfriend00 May 10 '20 at 16:28
  • Before you go down this theoretical path further, I think you'd have to focus on how big the gains would actually be. So, you parse the http headers on nginx and then convert them to what? You then deliver that converted thing to node.js and it has to understand it. So it has to be some sort of protocol between nginx and node.js, but whatever it is, it has to be so much simpler to read that you get a meaningful performance gain. I guess it would have to be a binary thing laid out in records if you're not going to have to "parse" it (see the sketch after these comments). How much would that really save? – jfriend00 May 10 '20 at 16:31
  • @jfriend00 - I... don't really know. That's another thing - I would really like to see a benchmark, but if nobody has tried... I don't know. All I thought was - "hey, it looks like the same work is being done twice and this could be avoided with just a bit of effort". And it's not so much the parsing of correctly submitted headers as correctly handling all the _invalid_ headers that hackers might try in order to break in. The security aspect could then be handed over to the webserver too, and the node.js part could be made much simpler, because it would simply trust it. – Vilx- May 10 '20 at 22:05
  • It _feels_ like there's a non-trivial amount of work being duplicated here... but, yeah, without benchmarks, it's just a guess... – Vilx- May 10 '20 at 22:06
  • You seem to be assuming a node.js that would NEVER be the front-facing web server (that is not what node.js is) instead of one that just sometimes isn't. So, if you can't ever realize those savings (because node.js always has to be able to be the front-facing web server), then down your path you're always stuck with the double work in node.js. I, for one, appreciate the fact that I can spin up a simple web site in node.js without having to deploy another web server. I run one on my Raspberry Pi, for example, in a pretty small amount of memory, and it controls some HVAC things. – jfriend00 May 10 '20 at 22:13
  • @jfriend00 - True, and I don't want to say that Node.js should lose this capability. Not at all - it's a good feature to have, and it should stay and evolve. However, it only works when you have just one application on your machine. When you have several (say, because of microservices), you need something in front of them that looks at the requests and figures out where each one should go. If this were Apache+PHP, PHP would be a module inside the Apache process, and communication would be easy and fast. Since Node can't do that, I'm left wondering: what's the best alternative? – Vilx- May 10 '20 at 22:47
  • If you're deploying yourself, many people use nginx with node.js for that. In larger commercial deployments, there may be dedicated hardware/load balancer that serves that purpose. – jfriend00 May 11 '20 at 00:35
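To make jfriend00's "binary thing laid out in records" point concrete, here's a sketch of what reading such a hypothetical pre-parsed format could look like. The record layout is entirely invented, just to show the shape of the idea - reading becomes offset arithmetic instead of delimiter scanning:

```js
// Hypothetical record format: [u16 name length][name][u16 value length][value]...
function readHeaders(buf) {
  const headers = {};
  let off = 0;
  while (off < buf.length) {
    const nameLen = buf.readUInt16BE(off); off += 2;
    const name = buf.toString('ascii', off, off + nameLen); off += nameLen;
    const valueLen = buf.readUInt16BE(off); off += 2;
    headers[name] = buf.toString('utf8', off, off + valueLen); off += valueLen;
  }
  return headers;
}

// Encoder for the same invented format, so the example is self-contained.
function writeHeader(name, value) {
  const n = Buffer.from(name, 'ascii'), v = Buffer.from(value, 'utf8');
  const rec = Buffer.alloc(4 + n.length + v.length);
  rec.writeUInt16BE(n.length, 0);
  n.copy(rec, 2);
  rec.writeUInt16BE(v.length, 2 + n.length);
  v.copy(rec, 4 + n.length);
  return rec;
}

const wire = Buffer.concat([writeHeader('host', 'example.com'), writeHeader('accept', '*/*')]);
console.log(readHeaders(wire)); // { host: 'example.com', accept: '*/*' }
```

Whether skipping the delimiter scanning saves anything measurable is exactly the open benchmarking question from the comments above.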

0 Answers