126

I want to open a page up in node and process the contents in my application. Something like this seems to work well:

var opts = {host: host, path:pathname, port: 80};
http.get(opts, function(res) {
  var page = '';
  res.on('data', function (chunk) {
    page += chunk;
  });
  res.on('end', function() {
     // process page
  });

This doesn't work, however, if the page returns an 301/302 redirect. How would I do that in a reusable way in case there are multiple redirects? Is there a wrapper module on top of the http to more easily handle processing http responses from a node application?

hippietrail
  • 15,848
  • 18
  • 99
  • 158
Zach
  • 18,594
  • 18
  • 59
  • 68

9 Answers9

148

If all you want to do is follow redirects but still want to use the built-in HTTP and HTTPS modules, I suggest you use https://github.com/follow-redirects/follow-redirects.

yarn add follow-redirects
npm install follow-redirects

All you need to do is replace:

var http = require('http');

with

var http = require('follow-redirects').http;

... and all your requests will automatically follow redirects.

With TypeScript you can also install the types

npm install @types/follow-redirects

and then use

import { http, https } from 'follow-redirects';

Disclosure: I wrote this module.

Akora
  • 756
  • 7
  • 22
Olivier Lalonde
  • 19,423
  • 28
  • 76
  • 91
  • This is now here: https://github.com/request/request/blob/d18cb487cf4e02740778d4bb464b5a1c7441d800/lib/redirect.js#L46 – Adrian Lynch Mar 09 '16 at 16:51
  • 4
    This is way better than the accepted answer featuring `request` which would add 20+ new dependencies to your module for such a simple task. Thank you for keeping npm modules lightweight, Oliver! :) – Sainan Jan 16 '19 at 12:04
  • Doesn't work when I use it with audio hosted securely on s3. – thedreamsaver May 18 '19 at 16:40
  • Using TypeScript add this to your npm install: npm install @types/follow-redirects so you can use import {https} from 'follow-redirects'; This is a fantastic, simple, oh so efficient module. Merci Olivier ! – Louis-Eric Simard Oct 26 '19 at 19:45
  • I wasn't able to use things like [cacheable-lookup](https://github.com/szmarczak/cacheable-lookup) to work with this, unfortunately – Zathrus Writer Aug 13 '23 at 18:01
50

Is there a wrapper module on top of the http to more easily handle processing http responses from a node application?

request

Redirection logic in request

neo
  • 791
  • 5
  • 7
Raynos
  • 166,823
  • 56
  • 351
  • 396
  • 32
    Why the living b'jesus isn't this part of the built-in http module?! – aaaidan Oct 23 '12 at 08:02
  • 2
    It is. It's called `http.request` the API is pretty simple. – Raynos Oct 23 '12 at 20:24
  • 3
    Is it possible to have a callback for each redirect somehow? I'd like to store every single URL the request goes through. Couldn't find it in the docs. – Ignas Jun 26 '13 at 13:49
  • 16
    @Raynos, the request() method of the built-in `http` module does not follow redirects, therefore this is not part of the built-in `http` module. – gilad905 Nov 10 '16 at 16:44
  • This should be the top answer since it works as drop-in replacement to the built-in node client: https://stackoverflow.com/a/13390241/589493 – Tchakabam Oct 15 '18 at 10:48
  • `request` is a pretty [bloated](https://bundlephobia.com/result?p=request) module by today's standards. Folks prefer [cross-fetch](https://www.npmjs.com/package/cross-fetch), which uses the native [Fetch](https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API) implementation if run in a browser context, and brings the same API to Node. – Dan Dascalescu Dec 25 '18 at 05:00
  • 15
    `request` has been deprecated. – Ivan Rubinson Apr 27 '20 at 10:25
  • 1
    deprecated [indeed](https://github.com/request/request/issues/3142) – Klesun Jun 17 '20 at 08:29
29

Make another request based on response.headers.location:

      const request = function(url) {
        lib.get(url, (response) => {
          var body = [];
          if (response.statusCode == 302) {
            body = [];
            request(response.headers.location);
          } else {
            response.on("data", /*...*/);
            response.on("end", /*...*/);
          };
        } ).on("error", /*...*/);
      };
      request(url);
Nakilon
  • 34,866
  • 14
  • 107
  • 142
  • This is the answer if you want to use built in http lib, follow `response.headers.location` – Vidar Jul 23 '19 at 02:21
26

Update:

Now you can follow all redirects with var request = require('request'); using the followAllRedirects param.

request({
  followAllRedirects: true,
  url: url
}, function (error, response, body) {
  if (!error) {
    console.log(response);
  }
});
skozz
  • 2,662
  • 3
  • 26
  • 37
22

Here is my (recursive) approach to download JSON with plain node, no packages required.

import https from "https";

function get(url, resolve, reject) {
  https.get(url, (res) => {
    // if any other status codes are returned, those needed to be added here
    if(res.statusCode === 301 || res.statusCode === 302) {
      return get(res.headers.location, resolve, reject)
    }

    let body = [];

    res.on("data", (chunk) => {
      body.push(chunk);
    });

    res.on("end", () => {
      try {
        // remove JSON.parse(...) for plain data
        resolve(JSON.parse(Buffer.concat(body).toString()));
      } catch (err) {
        reject(err);
      }
    });
  });
}

async function getData(url) {
  return new Promise((resolve, reject) => get(url, resolve, reject));
}

// call
getData("some-url-with-redirect").then((r) => console.log(r));

wiesson
  • 6,544
  • 5
  • 40
  • 68
7

Here is function I use to fetch the url that have redirect:

const http = require('http');
const url = require('url');

function get({path, host}, callback) {
    http.get({
        path,
        host
    }, function(response) {
        if (response.headers.location) {    
            var loc = response.headers.location;
            if (loc.match(/^http/)) {
                loc = new Url(loc);
                host = loc.host;
                path = loc.path;
            } else {
                path = loc;
            }
            get({host, path}, callback);
        } else {
            callback(response);
        }
    });
}

it work the same as http.get but follow redirect.

jcubic
  • 61,973
  • 54
  • 229
  • 402
3

In case of PUT or POST Request. if you receive statusCode 405 or method not allowed. Try this implementation with "request" library, and add mentioned properties.
followAllRedirects: true,
followOriginalHttpMethod: true

       const options = {
           headers: {
               Authorization: TOKEN,
               'Content-Type': 'application/json',
               'Accept': 'application/json'
           },
           url: `https://${url}`,
           json: true,
           body: payload,
           followAllRedirects: true,
           followOriginalHttpMethod: true
       }

       console.log('DEBUG: API call', JSON.stringify(options));
       request(options, function (error, response, body) {
       if (!error) {
        console.log(response);
        }
     });
}
Sanjeet kumar
  • 3,333
  • 3
  • 17
  • 26
-1

If you have https server, change your url to use https:// protocol.

I got into similar issue with this one. My url has http:// protocol and I want to make a POST request, but the server wants to redirect it to https. What happen is that, turns out to be node http behavior sends the redirect request (next) in GET method which is not the case.

What I did is to change my url to https:// protocol and it works.

Kevin F.
  • 57
  • 6
-1

Possibly a little bit of a necromancing post here, but...

here's a function that follows up to 10 redirects, and detects infinite redirect loops. also parses result into JSON

Note - uses a callback helper (shown at the end of this post)

( TLDR; full working demo in context here or remixed-version here)

function getJSON(url,cb){

    var callback=errBack(cb);
    //var callback=errBack(cb,undefined,false);//replace previous line with this to turn off logging

    if (typeof url!=='string') {
        return callback.error("getJSON:expecting url as string");
    }

    if (typeof cb!=='function') {
        return callback.error("getJSON:expecting cb as function");
    }

    var redirs = [url],
    fetch = function(u){
        callback.info("hitting:"+u);
        https.get(u, function(res){
            var body = [];
            callback.info({statusCode:res.statusCode});
            if ([301,302].indexOf(res.statusCode)>=0) {
                if (redirs.length>10) {
                    return callback.error("excessive 301/302 redirects detected");
                } else {
                    if (redirs.indexOf(res.headers.location)<0) {
                        redirs.push(res.headers.location);
                        return fetch(res.headers.location);
                    } else {
                        return callback.error("301/302 redirect loop detected");
                    }
                }
            } else {
              res.on('data', function(chunk){
                  body.push(chunk);
                  callback.info({onData:{chunkSize:chunk.length,chunks:body.length}});
              });
              res.on('end', function(){
                  try {
                      // convert to a single buffer
                      var json = Buffer.concat(body);
                      console.info({onEnd:{chunks:body.length,bodyLength:body.length}});

                      // parse the buffer as json
                      return callback.result(JSON.parse(json),json);
                  } catch (err) {

                      console.error("exception in getJSON.fetch:",err.message||err);

                      if (json.length>32) {
                        console.error("json==>|"+json.toString('utf-8').substr(0,32)+"|<=== ... (+"+(json.length-32)+" more bytes of json)");
                      } else {
                          console.error("json==>|"+json.toString('utf-8')+"|<=== json");
                      }

                      return callback.error(err,undefined,json);
                  }
              });
            }
        });
    };
    fetch(url);   
}

Note - uses a callback helper (shown below)

you can paste this into the node console and it should run as is.

( or for full working demo in context see here )

var 

fs      = require('fs'),
https   = require('https');

function errBack (cb,THIS,logger) {

   var 
   self,
   EB=function(fn,r,e){
       if (logger===false) {
           fn.log=fn.info=fn.warn=fn.errlog=function(){};       
       } else {
           fn.log        = logger?logger.log   : console.log.bind(console);
           fn.info       = logger?logger.info  : console.info.bind(console);
           fn.warn       = logger?logger.warn  : console.warn.bind(console);
           fn.errlog     = logger?logger.error : console.error.bind(console);
       }
       fn.result=r;
       fn.error=e;
       return (self=fn);
   };


   if (typeof cb==='function') {
       return EB(

            logger===false // optimization when not logging - don't log errors
            ?   function(err){
                   if (err) {
                      cb (err);
                     return true;
                   }
                   return false;
               }

            :  function(err){
                   if (err) {
                      self.errlog(err);
                      cb (err);
                     return true;
                   }
                   return false;
               },

           function () {
               return cb.apply (THIS,Array.prototype.concat.apply([undefined],arguments));
           },
           function (err) {
               return cb.apply (THIS,Array.prototype.concat.apply([typeof err==='string'?new Error(err):err],arguments));
           }
       );
   } else {

       return EB(

           function(err){
               if (err) {
                   if (typeof err ==='object' && err instanceof Error) {
                       throw err;
                   } else {
                       throw new Error(err);
                   }
                   return true;//redundant due to throw, but anyway.
               }
               return false;
           },

           logger===false
              ? self.log //optimization :resolves to noop when logger==false
              : function () {
                   self.info("ignoring returned arguments:",Array.prototype.concat.apply([],arguments));
           },

           function (err) {
               throw typeof err==='string'?new Error(err):err;
           }
       );
   }
}

function getJSON(url,cb){

    var callback=errBack(cb);

    if (typeof url!=='string') {
        return callback.error("getJSON:expecting url as string");
    }

    if (typeof cb!=='function') {
        return callback.error("getJSON:expecting cb as function");
    }

    var redirs = [url],
    fetch = function(u){
        callback.info("hitting:"+u);
        https.get(u, function(res){
            var body = [];
            callback.info({statusCode:res.statusCode});
            if ([301,302].indexOf(res.statusCode)>=0) {
                if (redirs.length>10) {
                    return callback.error("excessive 302 redirects detected");
                } else {
                    if (redirs.indexOf(res.headers.location)<0) {
                        redirs.push(res.headers.location);
                        return fetch(res.headers.location);
                    } else {
                        return callback.error("302 redirect loop detected");
                    }
                }
            } else {
              res.on('data', function(chunk){
                  body.push(chunk);
                  console.info({onData:{chunkSize:chunk.length,chunks:body.length}});
              });
              res.on('end', function(){
                  try {
                      // convert to a single buffer
                      var json = Buffer.concat(body);
                      callback.info({onEnd:{chunks:body.length,bodyLength:body.length}});

                      // parse the buffer as json
                      return callback.result(JSON.parse(json),json);
                  } catch (err) {
                      // read with "bypass refetch" option
                      console.error("exception in getJSON.fetch:",err.message||err);

                      if (json.length>32) {
                        console.error("json==>|"+json.toString('utf-8').substr(0,32)+"|<=== ... (+"+(json.length-32)+" more bytes of json)");
                      } else {
                          console.error("json==>|"+json.toString('utf-8')+"|<=== json");
                      }

                      return callback.error(err,undefined,json);
                  }
              });
            }
        });
    };
    fetch(url);   
}

var TLDs,TLDs_fallback = "com.org.tech.net.biz.info.code.ac.ad.ae.af.ag.ai.al.am.ao.aq.ar.as.at.au.aw.ax.az.ba.bb.bd.be.bf.bg.bh.bi.bj.bm.bn.bo.br.bs.bt.bv.bw.by.bz.ca.cc.cd.cf.cg.ch.ci.ck.cl.cm.cn.co.cr.cu.cv.cw.cx.cy.cz.de.dj.dk.dm.do.dz.ec.ee.eg.er.es.et.eu.fi.fj.fk.fm.fo.fr.ga.gb.gd.ge.gf.gg.gh.gi.gl.gm.gn.gp.gq.gr.gs.gt.gu.gw.gy.hk.hm.hn.hr.ht.hu.id.ie.il.im.in.io.iq.ir.is.it.je.jm.jo.jp.ke.kg.kh.ki.km.kn.kp.kr.kw.ky.kz.la.lb.lc.li.lk.lr.ls.lt.lu.lv.ly.ma.mc.md.me.mg.mh.mk.ml.mm.mn.mo.mp.mq.mr.ms.mt.mu.mv.mw.mx.my.mz.na.nc.ne.nf.ng.ni.nl.no.np.nr.nu.nz.om.pa.pe.pf.pg.ph.pk.pl.pm.pn.pr.ps.pt.pw.py.qa.re.ro.rs.ru.rw.sa.sb.sc.sd.se.sg.sh.si.sj.sk.sl.sm.sn.so.sr.st.su.sv.sx.sy.sz.tc.td.tf.tg.th.tj.tk.tl.tm.tn.to.tr.tt.tv.tw.tz.ua.ug.uk.us.uy.uz.va.vc.ve.vg.vi.vn.vu.wf.ws.ye.yt.za.zm.zw".split(".");
var TLD_url = "https://gitcdn.xyz/repo/umpirsky/tld-list/master/data/en/tld.json";
var TLD_cache = "./tld.json";
var TLD_refresh_msec = 15 * 24 * 60 * 60 * 1000;
var TLD_last_msec;
var TLD_default_filter=function(dom){return dom.substr(0,3)!="xn-"};


function getTLDs(cb,filter_func){

    if (typeof cb!=='function') return TLDs;

    var 
    read,fetch,
    CB_WRAP=function(tlds){
        return cb(
            filter_func===false
            ? cb(tlds)
            : tlds.filter(
                typeof filter_func==='function'
                 ? filter_func
                 : TLD_default_filter)
            );
    },
    check_mtime = function(mtime) {
       if (Date.now()-mtime > TLD_refresh_msec) {
           return fetch();
       } 
       if (TLDs) return CB_WRAP (TLDs);
       return read();
    };

    fetch = function(){

        getJSON(TLD_url,function(err,data){
            if (err) {
                console.log("exception in getTLDs.fetch:",err.message||err);
                return read(true);      
            } else {
                TLDs=Object.keys(data);

                fs.writeFile(TLD_cache,JSON.stringify(TLDs),function(err){
                    if (err) {
                        // ignore save error, we have the data
                        CB_WRAP(TLDs);
                    } else {
                        // get mmtime for the file we just made
                        fs.stat(TLD_cache,function(err,stats){
                            if (!err && stats) {
                               TLD_last_msec = stats.mtimeMs; 
                            }
                            CB_WRAP(TLDs);    
                        });
                    }
                });
            }
        });
    };

    read=function(bypassFetch) {
        fs.readFile(TLD_cache,'utf-8',function(err,json){

            try {
                if (err) {

                    if (bypassFetch) {
                        // after a http errror, we fallback to hardcoded basic list of tlds
                        // if the disk file is not readable
                        console.log("exception in getTLDs.read.bypassFetch:",err.messsage||err);    

                        throw err;
                    }
                    // if the disk read failed, get the data from the CDN server instead
                    return fetch();
                }

                TLDs=JSON.parse(json);
                if (bypassFetch) {
                    // we need to update stats here as fetch called us directly
                    // instead of being called by check_mtime
                    return fs.stat(TLD_cache,function(err,stats){
                        if (err) return fetch();
                        TLD_last_msec =stats.mtimeMs;
                        return CB_WRAP(TLDs);
                    });
                }

            } catch (e){
                // after JSON error, if we aren't in an http fail situation, refetch from cdn server
                if (!bypassFetch) {
                    return fetch();
                }

                // after a http,disk,or json parse error, we fallback to hardcoded basic list of tlds

                console.log("exception in getTLDs.read:",err.messsage||err);    
                TLDs=TLDs_fallback;
            }

            return CB_WRAP(TLDs);
        });
    };

    if (TLD_last_msec) {
        return check_mtime(TLD_last_msec);
    } else {
        fs.stat(TLD_cache,function(err,stats){
            if (err) return fetch();
            TLD_last_msec =stats.mtimeMs;
            return check_mtime(TLD_last_msec);
        });
    }
}

getTLDs(console.log.bind(console));
unsynchronized
  • 4,828
  • 2
  • 31
  • 43