Solved using the answer from QHarr!
I'm trying to extract some information (starting with the title) from a website.
The code below works fine with http://google.com, but not with the link I need (url).
Error code: "HTTP Error 500: Internal Server Error"
Am I doing something wrong? Is it possible to do this another way?
from urllib.request import urlopen
import urllib.error
import bs4
import time

url = "http://st.atb.no/New/minskjerm/FST.aspx?visMode=1&cTit=&c1=1&s1=16011301&sv1=&cn1=&template=2&cmhb=FF6600&cmhc=00FF00&cshb=3366FF&cshc=FFFFFF&arb=000000&rows=1&period=&"

html = None
for i in range(5):  # Try 5 times to reach the page
    try:
        html = urlopen(url)
    except urllib.error.HTTPError as exc:
        print('Error code: ', exc)
        time.sleep(1)  # wait 1 second, then make the HTTP request again
        continue
    else:
        print('Success')
        break

# Stop here if every attempt failed, instead of hitting a NameError below
if html is None:
    raise SystemExit('Could not reach the page after 5 attempts')

soup = bs4.BeautifulSoup(html, 'lxml')
title = soup.find('title')
print(title.getText())
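
One thing that may be worth trying, as a sketch only: some servers answer the default Python-urllib User-Agent with an error while serving browsers normally. Whether that is what happens with this particular server is an assumption on my part. The names build_request, fetch and BROWSER_UA below are made up for illustration, and the User-Agent value is just an example of a browser-like string.

```python
import time
import urllib.error
from urllib.request import Request, urlopen

# Assumption: any browser-like string may do; this exact value is illustrative.
BROWSER_UA = "Mozilla/5.0"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that carries a browser-like User-Agent header."""
    return Request(url, headers={"User-Agent": user_agent})


def fetch(url, attempts=5, delay=1.0):
    """Try up to `attempts` times; return the body as bytes, or None on failure."""
    for _ in range(attempts):
        try:
            with urlopen(build_request(url), timeout=10) as response:
                return response.read()
        except urllib.error.HTTPError as exc:
            print("Error code:", exc)
            time.sleep(delay)  # pause before the next attempt
    return None
```

If fetch(url) returns bytes rather than None, they can be passed to bs4.BeautifulSoup(html, 'lxml') exactly as in the code above.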