I'm trying to fetch the HTML content of a given URL whose original content encoding is UTF-8. I get the HTML of the page, but the text within the HTML elements comes back garbled (as question marks).

This is what I do:

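var http = require('http');
var url = require('url');

// `path` holds the full URL of the page to fetch
var parsedPath = url.parse(path);
var options = {
    host: parsedPath.host,
    path: parsedPath.path,
    headers: {
        'Accept-Charset' : 'utf-8',
    }
};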

http.get(options, function (res) {
    var data = "";
    res.on('data', function (chunk) {
        data += chunk;
    });
    res.on("end", function () {
        console.log(data);
    });
}).on("error", function () {
    callback(null);
});

How can I enforce the encoding of the returned data?

Thanks

Ben Diamant
  • A client can't force a server to return data in any particular format. It can tell it that it only accepts data in a particular format and then the server can choose which format to send back … or it can ignore the information entirely. – Quentin Jan 17 '15 at 11:27
  • @Quentin thanks, so the header I use is unnecessary. How can I decode the data correctly instead? When I fetch the same HTML with a GET call in Postman, the text is displayed correctly, so I assume there's a way to achieve it. – Ben Diamant Jan 17 '15 at 11:29

1 Answer

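By default the response stream emits raw Buffer chunks, and data += chunk converts each chunk to a string on its own, so a multi-byte UTF-8 character that is split across two chunks gets corrupted. Call the setEncoding() method on the response so the stream decodes the bytes as UTF-8 before passing them to your handler: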

http.get(options, function (res) {
    res.setEncoding('utf8');

    var data = "";
    res.on('data', function (chunk) {
        data += chunk;
    });
    res.on("end", function () {
        console.log(data);
    });
});
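
To see why decoding chunk by chunk can go wrong, here is a minimal sketch that does not touch the network. It assumes the stream decoding enabled by setEncoding() behaves like Node's StringDecoder, and the sample string 'café' and the split position are made up for illustration:

```javascript
const { StringDecoder } = require('string_decoder');

// "é" takes two bytes in UTF-8 (0xC3 0xA9). Split the data in the middle
// of that character, as can happen at a chunk boundary.
const bytes = Buffer.from('café', 'utf8');
const chunks = [bytes.subarray(0, 4), bytes.subarray(4)];

// Naive concatenation: each Buffer is converted to a string on its own,
// so the two halves of "é" are decoded separately and become U+FFFD.
let naive = '';
for (const chunk of chunks) {
  naive += chunk;
}

// StringDecoder holds back the incomplete sequence until the rest arrives.
const decoder = new StringDecoder('utf8');
let decoded = '';
for (const chunk of chunks) {
  decoded += decoder.write(chunk);
}
decoded += decoder.end();

console.log(naive === 'café');   // false
console.log(decoded === 'café'); // true
```

Note that this only helps when the server actually sends UTF-8; if the page is served in another charset, the bytes would need to be decoded with that charset instead.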
KyleMit
alexpods