My customer is suddenly experiencing problems with an HTML scraper job built with Node.js. I have narrowed the cause down to the Request module. To isolate it, I wrote a small test application that does nothing but fetch the HTML of a given URL via the Request module:
var request = require('request');

// Fetch the page and print only the response status.
request('https://www.politi.dk/da/ompolitiet/jobipolitiet/ledige_stillinger/ledigestillinger', function (err, res, body) {
  if (err) {
    // Network-level failures (such as ECONNRESET) end up here.
    console.log(err);
  } else {
    console.log('statusCode:', res.statusCode);
    console.log('statusMessage:', res.statusMessage);
  }
});
This example fails, however. Running the application produces the following error:
{ Error: socket hang up
    at TLSSocket.onHangUp (_tls_wrap.js:1137:19)
    at Object.onceWrapper (events.js:313:30)
    at emitNone (events.js:111:20)
    at TLSSocket.emit (events.js:208:7)
    at endReadableNT (_stream_readable.js:1064:12)
    at _combinedTickCallback (internal/process/next_tick.js:138:11)
    at process._tickCallback (internal/process/next_tick.js:180:9)
  code: 'ECONNRESET',
  path: null,
  host: 'www.politi.dk',
  port: 443,
  localAddress: undefined }
However, if I change the URL to one on any other domain, it works and I get the following:
statusCode: 200
statusMessage: OK
I have tried other URLs on the politi.dk domain, and they don't work either. So the problem seems to be specific to this domain when pages are requested via the Request module. The strange thing is that it worked until recently. What could cause this? Could a configuration change on the politi.dk server be causing it now? I have had a hard time finding anything helpful on Google. I found the nodejs-what-does-socket-hang-up-actually-mean thread here on SO, which describes the exact same problem, but the answers don't help me much.
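To narrow it down further, I am considering a probe like the one below, which bypasses the Request module and uses Node's built-in https module instead. It is only a rough sketch: `buildOptions` and `probe` are helper names I made up, and the idea that the server might now treat requests differently depending on their headers (for example a browser-like User-Agent) is purely an assumption on my part, not something I have confirmed.

```javascript
const https = require('https');
const { URL } = require('url');

// Hypothetical helper: turn a URL string plus optional headers
// into an options object for https.get.
function buildOptions(target, headers = {}) {
  const u = new URL(target);
  return {
    hostname: u.hostname,
    port: Number(u.port) || 443, // fall back to the default HTTPS port
    path: u.pathname + u.search,
    headers,
  };
}

// Hypothetical helper: resolve with the status code on success,
// or with the error code (e.g. 'ECONNRESET') on failure.
function probe(target, headers) {
  return new Promise((resolve) => {
    https
      .get(buildOptions(target, headers), (res) => {
        res.resume(); // discard the body; only the status matters here
        resolve({ ok: true, statusCode: res.statusCode });
      })
      .on('error', (err) => resolve({ ok: false, code: err.code }));
  });
}
```

Calling `probe(url)` once without headers and once with something like `{ 'User-Agent': 'Mozilla/5.0' }` should show whether the hang-up also occurs outside the Request module, and whether the request headers make any difference.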
Anyone?