
I'm using a simple Node.js server to send a large JSON file to the client, where each line is a self-contained JSON object. I'd like to send this file to the client one line at a time, but my problem is that the server waits until response.end() has been called and then sends the whole thing at once.

The code for my server looks like this:

const http = require("http");

http.createServer(async function (request, response) {
   response.writeHead(200, {"Content-Type": "application/json; charset=UTF-8", "Transfer-Encoding": "chunked", "Cache-Control": "no-cache, no-store, must-revalidate", "Pragma": "no-cache", "Expires": 0});
   // One self-contained JSON value per line
   response.write(JSON.stringify(["The first bit of JSON content"]) + "\n");
   response.write(await thisFunctionTakesForever());
   response.end();
}).listen(8080); // port chosen arbitrarily for this example

I really don't want to make the user wait until the entire JSON file has been loaded before my script can start parsing the results. How can I make my server send the data in chunks?


Additional info: How do I know my Node.js server isn't sending any part of the file until after response.end has been called?

I'm using XMLHttpRequest to handle the chunks as they arrive. I understand that http.responseText always grows with each chunk, so I filter through it to find the new lines that arrive each time:

// Lines already parsed; each element is a parsed JSON array whose first item identifies it
let previousResults = [];

let http = new XMLHttpRequest();
http.open('GET', url, true);
http.setRequestHeader('Content-type', 'application/x-www-form-urlencoded');
http.onreadystatechange = function() {
    if(http.readyState >= 3 && http.status == 200) {
        // Parse the data as it arrives, throwing out the lines we've already received
        // and keeping only the new ones
        let json = http.responseText.trim().split(/[\n\r]+/g);
        let dataChunks = json.map(e => JSON.parse(e));

        let newResults = [];
        for(let i = 0; i < dataChunks.length; i++) {
            if(!previousResults.map(e => e[0]).includes(dataChunks[i][0])) {
                newResults.push(dataChunks[i]);
            }
        }
        previousResults = previousResults.concat(newResults);
    }
};
http.send();

The array previousResults should grow slowly over time. But instead, there's a huge delay, then everything suddenly appears all at once.

The following thread is related, but unfortunately none of the proposed solutions solved my problem: Node.js: chunked transfer encoding

  • It seems like you want a WebSocket server. HTTP doesn't allow multiple responses for one request. – jabaa Aug 19 '22 at 07:38
  • According to the manual, "The first time response.write() is called, it will send the buffered header information and the first chunk of the body to the client. The second time response.write() is called, Node.js assumes data will be streamed, and sends the new data separately." Oh well, I'll try implementing a WebSocket server... because that's clearly not the way I see response.write() working... – Caspian Aug 19 '22 at 07:48
  • [Server Sent Events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events) (also suggested in the comments in the accepted answer below) work a lot like your approach: you write a line (message event), it gets sent and handled client-side, repeat. In some ways it's easier than WebSockets as it's _just HTTP_ (preferably HTTP/2), but make sure any proxy servers in-between don't buffer the responses. – RickN Aug 19 '22 at 09:43
  • Wow, I didn't even know server sent events were a thing. I'll definitely check it out! – Caspian Aug 19 '22 at 13:09

1 Answer


I see you are using chunked encoding ("Transfer-Encoding": "chunked"). This kind of encoding transfers each chunk individually, so it is indeed possible to write each chunk immediately without waiting for the others.

Each chunk is encapsulated by the http library in the format defined in RFC 2616: each chunk starts with a line giving the chunk size in hexadecimal, terminated by a <CR><LF>, followed by the chunk content itself. A final zero-size chunk signals that all the chunks have been sent.
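
For illustration (the http module does this framing for you, so you never write it yourself), a body made of the two chunks "Hello" and "World" would look roughly like this on the wire, with the CR/LF pairs shown as \r\n:

5\r\n
Hello\r\n
5\r\n
World\r\n
0\r\n
\r\n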

Here is a Node.js example:

const http = require("http")

// Writes one chunk after `index` seconds and ends the response after the last chunk
function generateChunk(index, res, total) {
    setTimeout(() => {
        res.write(`<p> chunk ${index}</p>`)
        if (index === total) {
            res.end()
        }
    }, index * 1000)
}

function handlerRequest(req, res) {
    res.setHeader("Content-Type", "text/html; charset=UTF-8")
    res.setHeader("Transfer-Encoding", "chunked")
    // Schedule six chunks (index 0 through 5), roughly one per second
    let index = 0
    const total = 5
    while (index <= total) {
        generateChunk(index, res, total)
        index++
    }
}

const server = http.createServer(handlerRequest)
server.listen(3000)
console.log("server started at http://localhost:3000")

If you capture the TCP packets, you will see the different chunks arrive in different TCP packets; they don't depend on each other.

[screenshot of the packet capture]

See the image:

  1. Each PSH packet carries one chunk.
  2. There is a delay between each chunk's transmission.
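
If you don't want to capture packets, a simpler check (assuming curl is installed) is curl -N http://localhost:3000: the -N flag disables curl's output buffering, so each chunk is printed as soon as it arrives, roughly one second apart with the example above.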

However, an HTTP client (such as the browser) may still buffer all the chunks before handing them over to the application. One reason is that after the final chunk the server may send trailer headers, such as Content-MD5, which the client would have to verify once all chunks have been received and before passing the body on. I think that is why you don't see the chunks arrive one by one on the browser side.
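
As the comments suggest, Server-Sent Events map nicely onto this one-JSON-line-at-a-time pattern, because the browser hands each message to your code as soon as it arrives instead of buffering the whole response. A minimal sketch, assuming the page is served from the same origin (the port, the one-message-per-second timer, and the payloads are only placeholders):

const http = require("http")

http.createServer((req, res) => {
    res.writeHead(200, {
        "Content-Type": "text/event-stream",
        "Cache-Control": "no-cache",
        "Connection": "keep-alive"
    })
    let count = 0
    const timer = setInterval(() => {
        // Each SSE message is "data: <payload>\n\n"; send one self-contained JSON line per message
        res.write(`data: ${JSON.stringify(["JSON line " + count])}\n\n`)
        if (++count === 5) {
            clearInterval(timer)
            res.end()
        }
    }, 1000)
    // Stop the timer if the client disconnects early
    req.on("close", () => clearInterval(timer))
}).listen(3000)

On the client, each message event delivers one complete line, so there is no need to diff responseText the way the original XMLHttpRequest code does:

// The sketch server above answers every path with the event stream, so a relative URL is enough
const source = new EventSource("/")
source.onmessage = (event) => {
    const line = JSON.parse(event.data)  // one complete JSON value per message
    console.log(line)
}
// EventSource reconnects automatically when the stream ends; close it here to stop that in this sketch
source.onerror = () => source.close()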

  • Thanks so much for taking the time to explain exactly what I'm doing right and wrong. I'll have to rethink my client, not my server. Thanks for your help! – Caspian Aug 19 '22 at 09:14
  • Try to use [SSE](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events) or even WebSockets if needed – Pylon Aug 19 '22 at 09:16
  • I'll definitely look into it. Thanks! – Caspian Aug 19 '22 at 13:09