I am trying to write a Python (version 2.7.5) CGI script on a CentOS 7 server. The script attempts to download data from a LibriVox page such as https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/ and bombs out with this error:
<class 'urllib2.URLError'>: <urlopen error [Errno 13] Permission denied>
args = (error(13, 'Permission denied'),)
errno = None
filename = None
message = ''
reason = error(13, 'Permission denied')
strerror = None
I have shut down iptables, and I can run things like `wget -O- https://librivox.org/selections-from-battle-pieces-and-aspects-of-the-war-by-herman-melville/` without error. Here is the bit of code where the error occurs:
def output_html(url, appname, doobb):
    print "url is %s<br>" % url
    soup = BeautifulSoup(urllib2.urlopen(url).read())
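For reference, the User-Agent suggestion from the comments can also be done with the standard library alone, without switching to requests. A minimal sketch (not from the original script; the try/except import just keeps it runnable on Python 2 and 3):

```python
# Build a urllib2-style request carrying a browser-like User-Agent header.
try:
    from urllib2 import Request  # Python 2
except ImportError:
    from urllib.request import Request  # Python 3

def build_request(url):
    # Attach a User-Agent so the request looks like a regular browser.
    return Request(url, headers={'User-Agent': 'Mozilla/5.0'})

req = build_request('https://librivox.org/')
print(req.get_header('User-agent'))  # prints: Mozilla/5.0
```

Note that `Request` only builds the object; nothing is sent until `urlopen(req)` is called, so this sketch can be tried without touching the network.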
Update: Thanks Paul and alecxe; I have updated my code like so:
def output_html(url, appname, doobb):
    # hdr = {'User-Agent': 'Mozilla/5.0'}
    # print "url is %s<br>" % url
    # req = urllib2.Request(url, headers=hdr)
    # soup = BeautifulSoup(urllib2.urlopen(url).read())
    headers = {'User-Agent': 'Mozilla/5.0'}
    # headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.99 Safari/537.36'}
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.content)
... and I get a slightly different error when `response = requests.get(url, headers=headers)` gets called ...
<class 'requests.exceptions.ConnectionError'>: ('Connection aborted.', error(13, 'Permission denied'))
args = (ProtocolError('Connection aborted.', error(13, 'Permission denied')),)
errno = None
filename = None
message = ProtocolError('Connection aborted.', error(13, 'Permission denied'))
request = <PreparedRequest [GET]>
response = None
strerror = None
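Both tracebacks carry the same underlying failure: Errno 13 is the operating system's EACCES, raised by the connect() system call itself, so no HTTP request ever reaches the server. A small check (nothing here is from the original script) showing that Errno 13 is an OS error code, not an HTTP status:

```python
# Errno 13 is EACCES, an OS-level "permission denied" from the socket layer.
import errno

print(errno.errorcode[13])   # EACCES on Linux
assert errno.EACCES == 13
```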
... the funny thing is that I wrote a command-line version of this script and it works fine; it looks something like this ...
def output_html(url):
    soup = BeautifulSoup(urllib2.urlopen(url).read())
Very strange don't you think?
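One difference worth ruling out: the shell run and the CGI run do not execute as the same user or in the same security context. A hedged diagnostic sketch (`runtime_context` is a hypothetical helper, not part of the original script) that can be dropped into both versions to compare their environments:

```python
# Print identity details that commonly differ between a shell and a CGI run.
import getpass
import os

def runtime_context():
    """Return identity details for the current process."""
    return {
        'user': getpass.getuser(),
        'uid': os.getuid(),
        'gid': os.getgid(),
    }

print(runtime_context())
```

If the CGI version reports a different user (e.g. apache) than the shell version, the denial is coming from that user's restrictions rather than from the code.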
Update: This question has been flagged as a possible duplicate of "urllib2.HTTPError: HTTP Error 403: Forbidden".
NO, THOSE ANSWERS DO NOT ANSWER THE QUESTION: a 403 is an HTTP response sent back by the server, while this failure is an OS-level Errno 13 raised before any request goes out.