
I am using Beautiful Soup to try to parse information from a webpage:

import requests

url = 'https://www.onthemarket.com/for-sale/2-bed-flats-apartments/shortlands-station/?max-bedrooms=&radius=0.5'
req = requests.get(url)

req returns <Response [403]>

With Python requests, a 403 Forbidden usually suggests a user-agent issue, but I cannot see what is wrong in my case.

Are there any suggestions?
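A quick way to see what was actually sent is to print the prepared request's headers; by default requests identifies itself with a python-requests/x.y User-Agent, which many sites block outright. A minimal diagnostic sketch:

import requests

url = 'https://www.onthemarket.com/for-sale/2-bed-flats-apartments/shortlands-station/?max-bedrooms=&radius=0.5'
req = requests.get(url)

# Inspect what came back and what was sent: the default User-Agent is
# python-requests/x.y.z, which many sites filter on
print(req.status_code)
print(req.request.headers)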

frank
  • I notice the header `cookie: logglytrackingsession=` being set in the request. The server likely denies requests without a tracking cookie, which gets set when the page is loaded in a browser (see the sketch after these comments). – clubby789 Oct 14 '19 at 21:51
  • It could be what @JammyDodger mentions, or it could be the user agent you mentioned; check the headers your browser sends when accessing the site. – luis.parravicini Oct 14 '19 at 21:53
  • @luis, it was the headers. Thanks. – frank Oct 14 '19 at 22:03
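If the tracking cookie had been the blocker, a requests.Session would be the usual fix, since it replays any cookies the server sets on later requests. A minimal sketch of that approach (the cookie name comes from the comment above; per the last comment, the actual fix here was the headers):

import requests

url = 'https://www.onthemarket.com/for-sale/2-bed-flats-apartments/shortlands-station/?max-bedrooms=&radius=0.5'

session = requests.Session()
# Any Set-Cookie from the first response (e.g. logglytrackingsession) is
# stored in the session's cookie jar and sent automatically afterwards
session.get(url)
resp = session.get(url)
print(resp.status_code)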

1 Answer


In such cases, send headers that include a User-Agent:

from bs4 import BeautifulSoup
import requests

url = 'https://www.onthemarket.com/for-sale/2-bed-flats-apartments/shortlands-station/?max-bedrooms=&radius=0.5'

# Pretend to be a browser: the default python-requests User-Agent gets rejected
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.84 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
}

html_page = requests.get(url, headers=headers).text
soup = BeautifulSoup(html_page, "html.parser")

print(soup.text)
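As a small hardening of the above, raise_for_status() fails loudly if the 403 ever comes back, instead of silently parsing an error page. This sketch reuses url and headers from the snippet above:

resp = requests.get(url, headers=headers)
resp.raise_for_status()  # raises requests.exceptions.HTTPError on 4xx/5xx
soup = BeautifulSoup(resp.text, "html.parser")
print(soup.title)  # sanity check: the real page title, not an error page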