Everyone here is on the right track, but to put the issue to bed: you cannot call .setEncoding(). EVER.
If you call .setEncoding(), it will create a StringDecoder and set it as the default decoder. If you try to pass null or undefined, it will still create a StringDecoder with its default encoding of UTF-8. Even if you call .setEncoding('binary'), it's the same as calling .setEncoding('latin1'). Yes, seriously.
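To see this for yourself, here is a minimal sketch that checks what a stream reports after each kind of call. It uses a plain Readable rather than a real request, and encodingAfterSet is a hypothetical helper name used only for illustration.

```javascript
const { Readable } = require('stream');

// Hypothetical helper: creates a fresh stream, calls setEncoding() with the
// given argument, and returns the encoding the stream reports afterwards.
function encodingAfterSet(arg) {
  const stream = new Readable({ read() {} });
  stream.setEncoding(arg);
  return stream.readableEncoding;
}

// A stream that never had setEncoding() called has no encoding at all.
const untouched = new Readable({ read() {} });
console.log(untouched.readableEncoding);

// null and undefined both fall back to the StringDecoder default.
console.log(encodingAfterSet(undefined));
console.log(encodingAfterSet(null));

// 'binary' is only an alias for 'latin1'.
console.log(encodingAfterSet('binary'));
```

The only case that leaves chunks as Buffer objects is the first one, where .setEncoding() is never called.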
I wish I could say you could just set ._readableState.encoding and ._readableState.decoder back to null, but when you call .setEncoding(), any data already sitting in the internal buffer gets wiped and replaced with a decoded version of what was there before. That means your data has already been changed.
If you want to "undo" the decoding, you have to re-encode the data stream back into binary like so:
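req.on('data', (chunk) => {
  let buffer;
  if (typeof chunk === 'string') {
    // .setEncoding() was called somewhere, so re-encode the string
    // using the stream's current encoding to get bytes back.
    buffer = Buffer.from(chunk, req.readableEncoding);
  } else {
    // No decoder is active; the chunk is already a Buffer.
    buffer = chunk;
  }
  // Handle buffer
});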
Of course, if you never call .setEncoding(), then you don't have to worry about the chunk being returned as a string.
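One caveat worth checking before relying on the "undo": whether re-encoding gives back the exact bytes depends on the encoding that was set. The sketch below uses made-up sample bytes and only Buffer methods to compare a latin1 round trip with a utf8 one.

```javascript
// Sample bytes that are not valid UTF-8 on their own (think binary upload).
const original = Buffer.from([0xff, 0xfe, 0x00, 0x41]);

// latin1 maps every byte to exactly one character, so decoding and
// re-encoding returns the same bytes.
const viaLatin1 = Buffer.from(original.toString('latin1'), 'latin1');

// utf8 replaces invalid sequences with U+FFFD while decoding, so the
// original bytes cannot be recovered by re-encoding.
const viaUtf8 = Buffer.from(original.toString('utf8'), 'utf8');

console.log(original.equals(viaLatin1));
console.log(original.equals(viaUtf8));
```

So if the stream was decoded as UTF-8 and the body was not valid UTF-8 text, re-encoding will not give you the original bytes, which is one more reason not to call .setEncoding() in the first place.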
Once you have your chunk as a Buffer, you can work with it as you choose. In the interest of thoroughness, here's how to read the whole request into a buffer with a preset initial size, while also checking Content-Length:
const BUFFER_SIZE = 4096;

/**
 * Reads the full request body into a single Buffer.
 * Assumes .setEncoding() was never called, so every chunk is a Buffer.
 * @param {IncomingMessage} req
 * @return {Promise<Buffer>}
 */
function readEntireRequest(req) {
  return new Promise((resolve, reject) => {
    // Content-Length is optional and untrusted; fall back to null if absent or invalid.
    const expectedSize = parseInt(req.headers['content-length'], 10) || null;
    // Start with at most BUFFER_SIZE bytes, or less if the body is known to be smaller.
    let data = Buffer.alloc(Math.min(BUFFER_SIZE, expectedSize || BUFFER_SIZE));
    let bytesWritten = 0;
    req.on('data', (chunk) => {
      if ((chunk.length + bytesWritten) > data.length) {
        // Buffer is too small. Keep doubling until the new chunk fits.
        let newLength = data.length * 2;
        while (newLength < chunk.length + bytesWritten) {
          newLength *= 2;
        }
        const newBuffer = Buffer.alloc(newLength);
        data.copy(newBuffer);
        data = newBuffer;
      }
      bytesWritten += chunk.copy(data, bytesWritten);
      if (bytesWritten === expectedSize) {
        // If we trust Content-Length, we could return immediately here.
      }
    });
    req.on('end', () => {
      if (data.length > bytesWritten) {
        // Return a slice of the original buffer
        data = data.subarray(0, bytesWritten);
      }
      resolve(data);
    });
    req.on('error', (err) => {
      reject(err);
    });
  });
}
The choice to use a buffer size here is to avoid immediately reserving a large amount of memory, and instead only allocate RAM as needed. The Promise wrapper is just for convenience.
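If it helps to see the grow-as-needed part on its own, here is a small sketch of the same doubling strategy pulled out of the request handler. GrowableBuffer is a hypothetical class written for this illustration, not something Node.js provides, and the tiny initial size is chosen only to force the buffer to grow.

```javascript
// Hypothetical helper (not a Node.js API): accumulates bytes in a buffer
// that starts small and doubles whenever the next chunk does not fit.
class GrowableBuffer {
  constructor(initialSize = 4096) {
    // Guard against a zero size, which could never grow by doubling.
    this.data = Buffer.alloc(Math.max(1, initialSize));
    this.bytesWritten = 0;
  }

  append(chunk) {
    const needed = this.bytesWritten + chunk.length;
    if (needed > this.data.length) {
      let newLength = this.data.length * 2;
      while (newLength < needed) {
        newLength *= 2;
      }
      const grown = Buffer.alloc(newLength);
      // Copy only the bytes written so far into the larger buffer.
      this.data.copy(grown, 0, 0, this.bytesWritten);
      this.data = grown;
    }
    this.bytesWritten += chunk.copy(this.data, this.bytesWritten);
  }

  // Returns a view over the bytes written so far, without copying.
  toBuffer() {
    return this.data.subarray(0, this.bytesWritten);
  }
}

// Start with 4 bytes so that both appends trigger a resize.
const acc = new GrowableBuffer(4);
acc.append(Buffer.from('hello '));
acc.append(Buffer.from('world'));
console.log(acc.toBuffer().toString('latin1'));
```

The trade-off is the usual one for doubling: a few extra copies while growing, in exchange for never reserving much more than twice what the body actually needs.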