
I have a simple Express app that launches a worker on each request; the worker generates a lot of data (5 MB, taking about 3 seconds) and sends over the first 50 KB of it. In case the same request is repeated, the worker gets a message to send over the next 50 KB; otherwise it terminates after some time.

The issue with this setup is that the main thread wastes time decoding the message and sending it out again via Express. I want to avoid this overhead as much as possible.

I have looked into Node's Cluster module, but it seems unfit for this because I am unable to cache my data in order to respond quickly with the next part of it: I cannot decide which worker in the cluster gets which request, which is essential here. Is there an easy solution I am missing, or is Node not the right framework for this problem?

Puasfeq

1 Answer


I think that depends on your use case.

If you're running in an environment with a reverse proxy (e.g. Nginx or Caddy) in front, you should be able to have each worker listen on a different port and configure the proxy to route each request to the correct worker.
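Whichever proxy you use, the routing has to be deterministic so that a repeat request lands on the same worker that holds the cached data. A minimal sketch of such a mapping (the port base, worker count, and hash here are illustrative assumptions, not tied to any particular proxy):

```javascript
const PORT_BASE = 3000;  // hypothetical: first worker's port
const WORKER_COUNT = 4;  // hypothetical: number of worker processes

// Deterministic routing: the same request key always maps to the same
// worker port, so that worker's cached data can serve the next 50 KB chunk.
function portFor(requestKey) {
  let hash = 0;
  for (const ch of requestKey) {
    hash = (hash * 31 + ch.charCodeAt(0)) >>> 0; // simple unsigned rolling hash
  }
  return PORT_BASE + (hash % WORKER_COUNT);
}
```

The same rule would be expressed in the proxy's own config (e.g. Nginx's `hash` directive on an upstream), but the property that matters is only that the mapping is stable.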

If you're running some sort of message queue (e.g. RabbitMQ), you could use a modified Saga pattern: the main thread only decides which worker takes the request, and the workers publish their responses directly to the queue.

Another interesting piece of information is this question, which describes how Linux/Unix systems can pass a file descriptor (the TCP connection to your client is a file descriptor on Unix) to another process, so you can:

app.get("/foo", (req, res) => {
    // `_handle` is an internal, undocumented API; this works on Linux/Unix
    // but may break between Node versions.
    const fd = res.socket._handle.fd || res.socket._parent._handle.fd;
    // A request object is not serializable, so send only what the worker needs.
    worker.postMessage({
        type: "gotRequest",
        url: req.url,
        responseFD: fd,
    });
    // Don't write to `res` after this; the worker owns the connection now.
});

The worker can then open this file descriptor as a writable stream and write its response to it. It's certainly not simple, but it does seem to be the way to go in this kind of situation.

Also, the Cluster API is probably your best fit. You should be able to cache pretty easily by storing the data you need cached in a file, or by running Redis or Memcached as a cache.

Dharman
Jake