import requests
import webbrowser
from bs4 import BeautifulSoup

url = 'https://www.gamefaqs.com'
#headers={'User-Agent': 'Mozilla/5.0'}    
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36'}


response = requests.get(url, headers)

response.status_code is returning 403. I can browse the website using Firefox/Chrome, so it seems to be a coding error.

I can't figure out what mistake I'm making.

Thank you.


3 Answers

This works if you make the request through a Session object.

import requests

session = requests.Session()
response = session.get('https://www.gamefaqs.com', headers={'User-Agent': 'Mozilla/5.0'})

print(response.status_code)

Output:

200
  • Thanks. What exactly is going on with the `Session` object that is making the difference? I've never had to make a `Session` object to scrape a site. – Moondra Jul 13 '17 at 16:49
  • @Moondra The main thing about Session objects is their compatibility with cookies. For all you know, it's possible the site is setting cookies and requiring them to be echoed back as a defence against scraping, which is probably against its policy. – cs95 Jul 13 '17 at 16:51 (see the cookie sketch below)
  • Cookies. I see. Thank you. – Moondra Jul 13 '17 at 19:39
  • I've tried this for another website and it doesn't fix the issue; I still get a 403. – SarahJessica Sep 06 '20 at 14:59
  • Same here, I'd like to know if you've found a solution? @SarahJessica – talha06 Apr 14 '21 at 19:37
  • It was a while ago, I can't remember @talha06. Sorry – SarahJessica Apr 14 '21 at 20:50
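
To illustrate the cookie point from the comments, here is a minimal sketch (using httpbin.org as an assumed test endpoint, not part of the original answer) of how a Session echoes back cookies it received earlier, while bare requests.get calls do not:

import requests

session = requests.Session()

# This endpoint sets a cookie and redirects to /cookies, which reports back
# the cookies it received; the session stores the cookie automatically...
session.get('https://httpbin.org/cookies/set/demo/1')
print(session.cookies.get('demo'))  # -> '1'

# ...and sends it back on later requests made through the same session.
print(session.get('https://httpbin.org/cookies').json())   # {'cookies': {'demo': '1'}}

# A bare requests.get() carries no cookie jar over, so nothing is echoed back.
print(requests.get('https://httpbin.org/cookies').json())  # {'cookies': {}}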

Using the `headers` keyword argument works for me. In the question's code, `requests.get(url, headers)` passes the dict as the second positional argument, which is `params`, so it is sent as query-string parameters and no custom User-Agent header ever reaches the server:

import requests
headers={'User-Agent': 'Mozilla/5.0'}
response = requests.get('https://www.gamefaqs.com', headers=headers)
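
A minimal sketch, using requests' Request/PreparedRequest objects (no network call needed), of what the positional call from the question actually builds:

import requests

# Passing the dict positionally makes it requests.get's params argument,
# so it is URL-encoded into the query string instead of sent as headers.
req = requests.Request('GET', 'https://www.gamefaqs.com',
                       params={'User-Agent': 'Mozilla/5.0'}).prepare()
print(req.url)      # https://www.gamefaqs.com/?User-Agent=Mozilla%2F5.0
print(req.headers)  # {} -- no User-Agent header was set at all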

Try using a Session.

import requests
session = requests.Session()
url = 'https://www.gamefaqs.com'  # the URL from the question
response = session.get(url, headers={'user-agent': 'Mozilla/5.0'})
print(response.status_code)

If the request still returns 403 Forbidden (even with a session object and a User-Agent header), you may need to add more headers:

headers = {
    'user-agent':"Mozilla/5.0 ...",
    'accept': '"text/html,application...',
    'referer': 'https://...',
}
r = session.get(url, headers=headers)

In Chrome, the request headers can be found under Network > Headers > Request Headers in the Developer Tools (press F12 to open them).

The reason is that some websites check for a User-Agent, or for the presence of other specific headers, before accepting a request.
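
As a concrete sketch of that advice (the header values below are illustrative browser-like defaults, not values captured from gamefaqs.com, and the referer is just an example):

import requests

session = requests.Session()

# Illustrative headers; in practice, copy the real values for the target
# site from your own browser's Network tab.
headers = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)',
    'accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'accept-language': 'en-US,en;q=0.5',
    'referer': 'https://www.google.com/',
}

response = session.get('https://www.gamefaqs.com', headers=headers)
print(response.status_code)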
