1

I am trying to get a page's HTML (on someone else's site) with an Axios GET request from my Node server, but the request returns a 403 error. The same request works in Postman.

    const axios = require('axios');

    axios(
      'https://www.ssense.com/en-ca/women/product/laura-lombardi/gold-cable-chain-necklace/6378111',
      {
        headers: {
          // tried to fake the user-agent but this didn't change anything
          'User-Agent':
            'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2883.95 Safari/537.36',
        },
      }
    )
      .then((result) => {
        console.log(result);
      })
      .catch((err) => {
        console.log(err);
      });

Some have suggested that this is a CORS issue. This confuses me. I thought CORS was something that only affects browsers. If it is a CORS issue, why would it affect axios? Since I have no control over the site's server, I of course can't change their CORS settings. So what is the solution?

Why can't I get the response in axios like I can in Postman? Is there a way to get the HTML in Node like Postman does?

Dashiell Rose Bark-Huss

3 Answers

1

Postman is a developer tool and, in contrast to browsers, does not enforce CORS. That is why the request works in Postman and why it does not work in a browser. I am not entirely sure about your context, but when we cannot control the server or API we are requesting and we run into CORS issues, the solution is sometimes to use a proxy. Like this one, for example.

Dharman
rd05
1

Because you have to send the exact headers that the browser sends to the server. I can solve this issue using Scrapy. Thanks

Md. Fazlul Hoque
  • exact headers? what headers? the headers that the site I'm trying to scrape sends to the browser? – Dashiell Rose Bark-Huss Jul 15 '21 at 22:33
  • Go to the Network tab in devtools, then open the XHR tab, where you will see the many requests the page sends. Find the exact URL among them, click it (in the Name column), then click Headers, and you will see the response headers and the request headers. You only need the request header fields. Thanks – Md. Fazlul Hoque Jul 15 '21 at 22:49
  • ahh so like copy the request a browser would send? – Dashiell Rose Bark-Huss Jul 15 '21 at 23:16
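
The suggestion in these comments, copying the request headers a browser would send, might look roughly like the sketch below. The header values are illustrative assumptions, not the set this particular site expects; the real ones would come from the request headers shown in the Network tab. `browserLikeHeaders` and `fetchHtml` are hypothetical helper names, and matching headers alone may not be enough to get past bot detection.

```javascript
// Hypothetical helper: builds request headers similar to what a desktop
// browser sends. The values are placeholders; copy the real ones from the
// request headers shown in the devtools Network tab.
function browserLikeHeaders(userAgent) {
  return {
    'User-Agent': userAgent,
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-CA,en;q=0.9',
  };
}

// Hypothetical wrapper around axios.get. axios is loaded lazily so the
// header helper above can be used on its own.
async function fetchHtml(url, userAgent) {
  const axios = require('axios');
  const response = await axios.get(url, {
    headers: browserLikeHeaders(userAgent),
  });
  // For an HTML page, response.data is typically the body as a string.
  return response.data;
}
```

Calling `fetchHtml(url, userAgent)` with the user-agent string copied from your own browser returns a promise for the page body, or rejects with the axios error if the server still answers 403.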
-1

Adding mode: 'no-cors' to the headers made it work intermittently, but I don't think that's a real header, so it was probably just a coincidence (as far as I can tell, mode is a request option of the Fetch API, not a header). I then realized that Postman was also giving me a 403.

On a closer look, the HTML in the response said: "Access to this page has been denied because we believe you are using automation tools to browse the website"

It says this can happen because:

  • JavaScript is disabled or blocked by an extension (ad blockers, for example)
  • Your browser does not support cookies
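
A message like this is only visible if you read the body of the 403 response instead of just logging the error. A minimal sketch of that, assuming the standard shape of an axios error (`err.response` when the server replied, `err.request` when no reply arrived); `describeAxiosError` is a hypothetical helper name:

```javascript
// Hypothetical helper: summarizes an error thrown by axios.
// axios attaches `response` when the server replied with an error status,
// and `request` when the request was sent but no response arrived.
function describeAxiosError(err, maxBodyLength = 200) {
  if (err.response) {
    const data = err.response.data;
    let body = '';
    if (typeof data === 'string') {
      body = data;
    } else if (data != null) {
      body = JSON.stringify(data) || '';
    }
    return {
      kind: 'http',
      status: err.response.status,
      bodySnippet: body.slice(0, maxBodyLength), // start of the HTML or JSON body
    };
  }
  if (err.request) {
    return { kind: 'no-response', status: null, bodySnippet: '' };
  }
  // The request was never sent, for example because of a bad config.
  return { kind: 'setup', status: null, bodySnippet: String(err.message) };
}
```

In the original snippet this could be used as `.catch((err) => console.log(describeAxiosError(err)))`, which prints the status code and the start of the page the server sent back.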

I switched to Puppeteer, since Puppeteer can run the page's JavaScript and, as far as I know, axios cannot. But I still had the issue. I'm going to look into this Stack Overflow question: How to avoid being detected as bot on Puppeteer and Phantomjs?
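
For reference, a minimal sketch of loading a page with Puppeteer, assuming the `puppeteer` package is installed; `fetchRenderedHtml` is a hypothetical helper name. It only shows how the rendered HTML can be read and does nothing about bot detection, so a site that blocks automation may still return the same denial page.

```javascript
// Hypothetical helper: loads a page in headless Chrome via Puppeteer and
// returns the HTML after the page's scripts have run.
// Assumes `puppeteer` is installed (npm install puppeteer).
async function fetchRenderedHtml(url) {
  const puppeteer = require('puppeteer'); // loaded lazily, only when called
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    // Wait until network activity has mostly settled before reading the DOM.
    await page.goto(url, { waitUntil: 'networkidle2' });
    return await page.content();
  } finally {
    // Always close the browser, even if navigation fails.
    await browser.close();
  }
}
```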

Dashiell Rose Bark-Huss