
I'm trying to check this site for changes to its title: https://allestoringen.nl/storing/kpn/

import bs4
import requests

url = 'https://allestoringen.nl/storing/kpn/'

source = requests.get(url).text
soup = bs4.BeautifulSoup(source, 'html.parser')
event_string = str(soup.find(text='Er is geen storing bij KPN'))

print(event_string)

However, this prints None every time.

MendelG
ironman
  • Would you mind posting the relevant HTML from the site? – Tim Feb 21 '21 at 21:46
  • If you can't find it then it's not there. The text on the site for me is: "Mogelijke storing bij KPN" – forgetso Feb 21 '21 at 21:47
  • Did you check if there was a 'storing' when you ran your code? – Jonas Feb 21 '21 at 21:47
  • This is your problem (and answer): https://stackoverflow.com/questions/41946166/requests-get-returns-403-while-the-same-url-works-in-browser – Mark H Feb 21 '21 at 21:59

1 Answer


The likely reason you don't get a result is that the website rejects your request. This is what I got:

page = requests.get(url)

page.status_code  # 403
page.reason       # 'Forbidden'

You might want to take a look at this post for a solution.

It is always a good idea to check the return status of your request in your code.
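A minimal sketch of such a status check, assuming the `requests` library used in the question. The name `fetch_html` and its parameters are hypothetical, chosen only for illustration:

```python
import requests


def fetch_html(url, headers=None, timeout=10):
    """Return the page body, or raise if the server rejects the request."""
    page = requests.get(url, headers=headers, timeout=timeout)
    # raise_for_status() raises requests.HTTPError for 4xx/5xx responses,
    # so a 403 Forbidden fails loudly instead of being parsed as the page.
    page.raise_for_status()
    return page.text
```

Calling `fetch_html(url)` without suitable headers should then raise an `HTTPError` mentioning the 403 rather than silently handing an error page to BeautifulSoup.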

To solve your actual problem, though, you might want to check the <title> element instead of searching for a specific string.

# stolen from the post I mentioned
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36'}

page = requests.get(url, headers=headers)
page.status_code  # 200. Adding a header solved the problem.
soup = bs4.BeautifulSoup(page.text,'html.parser')

# get title.
print(soup.find("title").text)
# 'KPN storing? Actuele storingen en problemen | Allestoringen'
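Since the goal is to detect changes to the title, the comparison could be sketched roughly as follows. This works on HTML that has already been downloaded; `page_title` and `title_changed` are hypothetical helper names, and how the previous title is stored between runs is left open:

```python
import bs4


def page_title(html):
    """Return the stripped text of the <title> element, or None if absent."""
    soup = bs4.BeautifulSoup(html, 'html.parser')
    if soup.title is None:
        return None
    return soup.title.get_text(strip=True)


def title_changed(previous_title, html):
    """Report whether the title in html differs from the one seen earlier."""
    return page_title(html) != previous_title
```

With `page.text` from the request that sends a User-Agent header, `title_changed(last_seen_title, page.text)` would tell you whether the title differs from the one you saved earlier.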
marc_s
Jonas