
I have this problem: I'm trying to write a Python script that downloads a web page and searches it for some information.

This is the code:

import urllib.request

url_archive_of_nethys = "http://www.aonprd.com/Default.aspx"


def getMainPage():
    fp = urllib.request.urlopen(url_archive_of_nethys)
    mybytes = fp.read()
    mystr = mybytes.decode("utf8")
    fp.close()
    print(mystr)



def main():
    getMainPage()


if __name__ == "__main__":
    main()

But when I run it, I get:

 <HTTPError 999: 'No Hacking'>

I also tried the curl command:

curl http://www.aonprd.com/Default.aspx

and it downloaded the page correctly.

I'm developing with Visual Studio and Python 3.6.

Any suggestions would be appreciated. Thank you.

Valgio
  • Possible duplicate of [999 Error Code on HEAD request to LinkedIn](https://stackoverflow.com/questions/27231113/999-error-code-on-head-request-to-linkedin) – petezurich Sep 29 '18 at 15:02
  • `def main(): getMainPage()` is pointless; Python doesn't require you to create a `main()` entry point – roganjosh Sep 29 '18 at 15:14

1 Answer


The site is probably filtering on the User-Agent header: by default urllib identifies itself as Python-urllib/3.6, while curl sends a different value. Try changing it:

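import urllib.request

url = "http://www.aonprd.com/Default.aspx"  # URL from the question

# Send a browser-like User-Agent instead of the default Python-urllib one
req = urllib.request.Request(
        url,
        data=None,
        headers={'User-Agent': ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) "
                                "AppleWebKit/537.36 (KHTML, like Gecko) "
                                "Chrome/35.0.1916.47 Safari/537.36")})
fp = urllib.request.urlopen(req)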
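
Putting this together with the function from the question, a minimal sketch could look like the one below. The helper names `default_urllib_user_agent`, `build_request` and `fetch_page` are hypothetical (they are not part of urllib), and the assumption that the 999 response is triggered only by the User-Agent has not been verified against the site.

```python
import urllib.error
import urllib.request

# Browser-like User-Agent string, same value as in the snippet above.
BROWSER_USER_AGENT = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/35.0.1916.47 Safari/537.36"
)


def default_urllib_user_agent():
    """Return the User-Agent urllib sends when none is set explicitly."""
    opener = urllib.request.build_opener()
    # addheaders is a list of (name, value) pairs; by default it holds
    # a single ('User-agent', 'Python-urllib/<version>') entry.
    return dict(opener.addheaders)["User-agent"]


def build_request(url, user_agent=BROWSER_USER_AGENT):
    """Hypothetical helper: build a GET request with a custom User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch_page(url):
    """Hypothetical helper: download a page and return it as text.

    Returns None if the server answers with an HTTP error status.
    """
    try:
        with urllib.request.urlopen(build_request(url)) as response:
            # Use the charset declared by the server, falling back to UTF-8.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset)
    except urllib.error.HTTPError as err:
        # The site may still refuse the request for other reasons.
        print("Request failed:", err.code, err.reason)
        return None
```

Calling `fetch_page("http://www.aonprd.com/Default.aspx")` would then replace the body of `getMainPage()`. If the server still answers with 999, printing `default_urllib_user_agent()` next to the headers curl sends (`curl -v`) is one way to compare what the two clients transmit.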
SocketPlayer