I wrote an Express middleware to retrieve the raw body from the request, and I set it before body-parser middleware.

My custom middleware is calling req.setEncoding('utf8'), but this causes the following body-parser error:

Error: stream encoding should not be set

at readStream (/node_modules/body-parser/node_modules/raw-body/index.js:211:17) 
at getRawBody (/node_modules/body-parser/node_modules/raw-body/index.js:106:12)
at read (/node_modules/body-parser/lib/read.js:76:3)
at jsonParser (/node_modules/body-parser/lib/types/json.js:127:5)

Here is my code:

var express = require('express');
var bodyParser = require('body-parser')

function myMiddleware() {
  return function(req, res, next) {
    req.rawBody = '';
    req.setEncoding('utf8');

    req.on('data', function(chunk) {
      req.rawBody += chunk;
    });

    req.on('end', function() {
      next();
    });
  }
}

var app = express();
app.use(myMiddleware());
app.use(bodyParser.json());

var listener = app.listen(3000, function() {
});

app.get('/webhook/', function (req, res) {
  res.sendStatus(200);
});

Is there a way to unset the encoding? Is there another way to retrieve the raw body, but still use body-parser after it?

marc_s
kiewic
  • Use your middleware after bodyParser? – nicovank Nov 16 '16 at 17:04
  • You have a typo in `res.sendStatu(200);` as well. – doublesharp Nov 16 '16 at 17:09
  • Are you sure you need to set the encoding? – doublesharp Nov 16 '16 at 17:13
  • @doublesharp typo fixed, thanks! – kiewic Nov 16 '16 at 17:18
  • @nicovank If I change the middleware order, my custom middleware hangs. I guess it is because the stream has already been consumed. I am investigating what it could be. – kiewic Nov 16 '16 at 17:19
  • @doublesharp You are right, I probably do not need to call `setEncoding()`. I thought I was required to, because without it the app hangs. Now I understand that whoever tries to read the body for a second time will hang, in this case *body-parser*. – kiewic Nov 16 '16 at 17:22
  • Right, because `next()` isn't getting called until after `end` is emitted on the data. Try just setting the event handlers and then calling `next()` at the end, *not* in a handler. – doublesharp Nov 16 '16 at 17:23
  • D'oh, your custom middleware is incorrect. The function inside `myMiddleware` is never getting called, so next is never getting called. – doublesharp Nov 16 '16 at 17:28

3 Answers

It turns out that body-parser has a verify option to call a function when the request body has been read. The function receives the body as a buffer.

Here is an example:

var express = require('express');
var bodyParser = require('body-parser');

function verifyRequest(req, res, buf, encoding) {
  // The raw body is contained in 'buf'
  console.log(buf.toString(encoding));
}

var app = express();
var listener = app.listen(3000);

// Hook 'verifyRequest' with body-parser here.
app.use(bodyParser.json({ verify: verifyRequest }));

app.post('/webhook/', function (req, res) {
  res.status(200).send("done!");
});
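
If the goal is to keep the raw body around for later handlers rather than just log it, the same hook can store it on the request. This is only a minimal sketch: `saveRawBody` and the `req.rawBody` property are names made up for illustration, not part of body-parser.

```javascript
// Hypothetical verify callback: keeps the unparsed body on the request.
// body-parser calls verify(req, res, buf, encoding) before it parses the body.
function saveRawBody(req, res, buf, encoding) {
  // 'buf' is a Buffer; fall back to UTF-8 if no encoding is passed
  req.rawBody = buf.toString(encoding || 'utf8');
}

// Wiring (assumes the 'app' and 'bodyParser' variables from the example above):
// app.use(bodyParser.json({ verify: saveRawBody }));
```

Handlers registered after the parser can then read `req.rawBody`. For requests the parser skips, such as a non-matching content type, the property stays undefined.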
Salathiel Genese
kiewic

You are calling next() inside the "end" handler, which means the stream has already been consumed by the time body-parser runs. Instead, set up the handler for "data", then pass the request along using next(). The "end" event is likely being handled inside bodyParser, so after it executes you have access to req.rawBody. If this were not the case, you would add another middleware that calls next() inside a req.on('end') handler to hold the rest of the chain until you have all the data.

// custom middleware - req, res, next must be arguments on the top level function
function myMiddleware(req, res, next) {
  req.rawBody = '';

  req.on('data', function(chunk) {
    req.rawBody += chunk;
  });

  // call next() outside of 'end' after setting 'data' handler
  next();  
}

// your middleware
app.use(myMiddleware);

// bodyparser
app.use(bodyParser.json())

// test that it worked
function afterMiddleware(req, res, next) {
  console.log(req.rawBody);
  next();  
}

app.use(afterMiddleware);
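
The timing matters because a listener only sees events emitted after it is attached. Here is a rough illustration, using a plain `EventEmitter` as a stand-in for the request stream. That stand-in is an assumption made to keep the sketch synchronous; a real request emits its events asynchronously.

```javascript
const { EventEmitter } = require('events');

// Stand-in for an incoming request (hypothetical; not a real http.IncomingMessage)
const fakeReq = new EventEmitter();

let early = '';
let late = '';
let lateEnded = false;

// Attached before any data arrives: sees every chunk
fakeReq.on('data', function (chunk) { early += chunk; });

// Simulate the body arriving and the stream finishing
fakeReq.emit('data', '{"a":');
fakeReq.emit('data', '1}');
fakeReq.emit('end');

// Attached after 'end': never called, so code waiting on these would hang
fakeReq.on('data', function (chunk) { late += chunk; });
fakeReq.on('end', function () { lateEnded = true; });

console.log(early);     // {"a":1}
console.log(late);      // (empty string)
console.log(lateEnded); // false
```

The late listeners correspond to body-parser being reached only after the custom middleware has already drained the request.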

If you need to access the raw body, you might also want to look into bodyParser.raw(). It puts the raw body into req.body as a Buffer, in the same place where bodyParser.json() puts the parsed object, and it can be made to run conditionally based on the content type; check out options.type.
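
As a rough sketch of that idea: with `bodyParser.raw({ type: 'application/json' })` in front, `req.body` arrives as a `Buffer`, and a follow-up middleware can keep it and parse the JSON itself. `keepRawAndParse` and `req.rawBody` are hypothetical names used only for this example.

```javascript
// Hypothetical middleware, meant to run after bodyParser.raw(),
// which leaves a Buffer in req.body for matching requests.
function keepRawAndParse(req, res, next) {
  if (!Buffer.isBuffer(req.body)) {
    return next(); // nothing was parsed (no body or a non-matching content type)
  }
  req.rawBody = req.body; // keep the untouched bytes
  try {
    req.body = JSON.parse(req.rawBody.toString('utf8'));
  } catch (err) {
    return next(err); // malformed JSON goes to the error handler
  }
  next();
}

// Wiring (assumes the 'app' and 'bodyParser' variables from the question):
// app.use(bodyParser.raw({ type: 'application/json' }));
// app.use(keepRawAndParse);
```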

doublesharp
  • `in your example the inner function is never being called` yes it is.... the function `myMiddleware` is returning the function and he calls it later `app.use(myMiddleware())` – nicovank Nov 16 '16 at 17:52
  • You are correct, but it isn't passing in `req, res, next` so they aren't executed in the right context – doublesharp Nov 16 '16 at 17:54
  • Writing the middleware handler as an inner function allows callers to pass options to the middleware at setup time. – kiewic Nov 16 '16 at 18:00
  • I have a new problem with this solution, `app.get('/', function (req, res) { })` is not being called anymore. – kiewic Nov 16 '16 at 18:25

I recommend a different approach, since your current approach actually consumes the message and makes it impossible for body-parser to read it (and there are a bunch of edge case bugs that spring up by calling next synchronously):

app.use(bodyParser.json());
app.use(bodyParser.text({type: '*/*'}));

This will read any application/json request as JSON, and everything else as text.

If you must have the JSON object in addition to the text, I recommend parsing it yourself:

app.use(bodyParser.text({type: '*/*'}));
app.use(myMiddleware);

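function myMiddleware(req, res, next) {
    // bodyParser.text() leaves the body as a string; keep a copy as the raw body
    req.rawBody = req.body;
    // req.is() also matches types with parameters, e.g. 'application/json; charset=utf-8'
    if (typeof req.body === 'string' && req.is('application/json')) {
        try {
            req.body = JSON.parse(req.body);
        } catch (err) {
            return next(err); // malformed JSON goes to the error handler
        }
    }
    next();
}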
tcooc