I have a Node.js application that aggregates content from various websites. Requests to fetch the feeds from the different sources are made asynchronously using request streams. I get a "socket hang up" error quite often when the requests are made.

err in accessing the link { Error: socket hang up
    at createHangUpError (_http_client.js:331:15)
    at TLSSocket.socketOnEnd (_http_client.js:423:23)
    at emitNone (events.js:111:20)
    at TLSSocket.emit (events.js:208:7)
    at endReadableNT (_stream_readable.js:1064:12)
    at _combinedTickCallback (internal/process/next_tick.js:139:11)
    at process._tickDomainCallback (internal/process/next_tick.js:219:9) code: 'ECONNRESET' } https://arstechnica.com/?p=1488489 

Environment details: node version - v8.12.0

I tried a few suggestions from related SO posts, such as 'NodeJS - What does "socket hang up" actually mean?', but I still get the same error.

import request from 'request';
import FeedParser from 'feedparser';

const extractor = require('unfluff');

export const getFeedsFromSource = function (urlfeed, etag, LastModified, callback) {
  console.log(urlfeed, etag, LastModified);
  const req = request({
    method: 'GET',
    url: urlfeed,
    headers: {
      'If-None-Match': etag,
      'If-Modified-Since': LastModified,
      Connection: 'keep-alive',
      ciphers: 'DES-CBC3-SHA',
    },
  });
  const feedparser = new FeedParser();
  const metaData = {};
  const htmlData = {};
  const feedData = {};
  // const pList = null;
  req.on('response', function (response) {
    const stream = this;
    if (response.statusCode === 304) {
      console.log('Source not modified: ', urlfeed);
    }
    if (response.statusCode === 200) {
      metaData.etagin = response.headers.etag;
      metaData.LastModifiedin = response.headers['last-modified'];
      metaData.LastModifiedLocal = response.headers['last-modified'];
      stream.pipe(feedparser).end();
    }
  });
  req.on('error', (err) => {
    console.log(`getFeed: err.message == ${err.message}`);
    callback(err);
  });
  // req.end();
  feedparser.on('readable', function () {
    try {
      const item = this.read();
      if (item !== null) {
        request({
          method: 'GET',
          url: item.link,
        }, (err, info) => {
          if (!err) {
            htmlData.body = info.body;
            const parsedData = extractor(htmlData.body, 'en');
            feedData.author = [];
            feedData.videos = [];
            feedData.feedtitle = parsedData.title;
            feedData.feedmainpicture = parsedData.image;
            feedData.feedsummary = parsedData.description;
            feedData.feedmaincontent = parsedData.text;
            feedData.author.push(item.author);
            if (item.author === null) {
              feedData.author = parsedData.author;
            }
            feedData.feedurl = item.link;
            feedData.copyright = item.meta.copyright;
            // feedData.videos = parsedData.videos;
            feedData.publishedDate = item.pubdate;
            if (item.categories.length > 0) {
              feedData.categories = item.categories;
              feedData.feedtags = item.categories;
            } else if (parsedData.keywords !== undefined) {
              feedData.categories = parsedData.keywords.split(' ').join('').split(',');
              feedData.feedtags = parsedData.keywords.split(' ').join('').split(',');
            } else {
              feedData.categories = [];
              feedData.feedtags = [];
            }
            metaData.sourcename = item.meta.title;
            callback(undefined, feedData, metaData);
          } else {
            console.log('err in accessing the link', err, item.link);
          }
        });
      }
    } catch (err) {
      console.log(`getFeed: err.message == ${err.message}`);
    }
  });
  feedparser.on('error', (err) => {
    console.log(`getFeed: err.message == ${err.message}`);
  });
  feedparser.on('end', () => {
    console.log('onend');
  });
};

Kindly help me out with this issue.

omnathpp
  • There are many possible reasons for an actual hangup. Can you add more information about the request load on the server/app (i.e. how many requests per min/sec and how many feeds per request)? Also, do you know whether the hangup happens on specific sites or on all of them? And lastly, are you running in a cluster? – Shachar Apr 11 '19 at 07:48
  • There are a total of 150 feed sources, and I scrape every feed from each source, so the number of requests sent ranges between 600 and 1400. The hangup occurs on different sites, not one specific site. I am not running a cluster. – omnathpp Apr 11 '19 at 08:01
  • I even get the following error sometimes: err in accessing the link { Error: read ECONNRESET at TLSWrap.onread (net.js:622:25) errno: 'ECONNRESET', code: 'ECONNRESET', syscall: 'read' } http://feedproxy.google.com/~r/businessinsider/~3/a0vQbcSuNeg/bi-insider-panel-2016-9 – omnathpp Apr 11 '19 at 08:10

1 Answer

There are many reasons for socket hangups and resets in production apps. From your description, I don't believe the cause is the app being overloaded with requests (unless you're running on a very slow machine). IMO, the most likely candidate is throttling by the remote server because of too many connections from the same IP. Browsers such as Chrome cap themselves at roughly 6 to 8 connections to any single host, and you should try not to exceed that range, even though each server has its own limit. To solve this, you should do one of the following:

  • add per-host request pooling (basically, set `Agent.maxSockets`)
  • use a proxy service (e.g. Luminati) to distribute requests over many source IPs (more relevant for high-concurrency requirements)
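
To make the first option concrete, here is a minimal sketch of per-host pooling with Node's built-in agents. The limit of 8 is the figure suggested in this thread, not a universal constant, and `agentFor` is a hypothetical helper name introduced only for illustration.

```javascript
const http = require('http');
const https = require('https');

// Assumed per-host cap, taken from the figure suggested in this thread.
const MAX_SOCKETS_PER_HOST = 8;

// One shared agent per protocol. Each agent keeps at most
// MAX_SOCKETS_PER_HOST sockets open to any single host and queues
// further requests until a socket becomes free.
const httpAgent = new http.Agent({ keepAlive: true, maxSockets: MAX_SOCKETS_PER_HOST });
const httpsAgent = new https.Agent({ keepAlive: true, maxSockets: MAX_SOCKETS_PER_HOST });

// Hypothetical helper: choose the agent that matches the URL's protocol.
function agentFor(url) {
  return url.startsWith('https:') ? httpsAgent : httpAgent;
}
```

With the `request` library used in the question, such an agent should be usable through its `agent` option, for example `request({ method: 'GET', url: item.link, agent: agentFor(item.link) }, ...)`. Reusing the same agent instances across all requests is what makes the per-host cap effective.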

One more thing to remember: requests can fail for 'natural' networking reasons (e.g. a bad or unstable internet connection, or spikes in server load), so you should always retry a request at least once before giving up.
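
The retry advice could look roughly like the sketch below. `withRetry` is a hypothetical helper, not part of `request` or Node; it assumes the operation takes a Node-style callback and retries on any error, whereas a real crawler would probably retry only on transient errors such as `ECONNRESET`.

```javascript
// Hypothetical helper: run a Node-style callback operation and, on failure,
// try again up to `retries` more times, waiting `delayMs` between attempts.
function withRetry(operation, retries, delayMs, callback) {
  operation((err, ...results) => {
    if (!err) {
      return callback(null, ...results);
    }
    if (retries <= 0) {
      // Out of attempts: report the last error to the caller.
      return callback(err);
    }
    setTimeout(() => withRetry(operation, retries - 1, delayMs, callback), delayMs);
  });
}
```

In the question's code, the inner article fetch could then be wrapped as `withRetry((cb) => request({ method: 'GET', url: item.link }, cb), 1, 1000, (err, info) => { ... })`, which gives each link one extra attempt after a one-second pause.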

Shachar
  • Great, thanks for the suggestions @Shachar. I tried increasing maxSockets but still ended up getting the socket hang up error. I also configured the Luminati proxy service for the requests, but I am still getting the same error. – omnathpp Apr 11 '19 at 12:45
  • @omnathpp I hope you understand that when I said setting `maxSockets`, I meant setting it to 8 to avoid too many connections. How did you set up Luminati? You need to have it change IP on every request/session (I have experience with that). – Shachar Apr 13 '19 at 11:26