I'm trying to use httplib to check whether each URL in a list of 30k+ websites still works. The URLs are read from a .csv file into a matrix, and a for-loop then goes through each URL in that matrix. My problem is in the next step: for each URL I call a function, runInternet(url), which takes the URL string and returns True if the URL works and False if it doesn't. I've used this as my baseline, and have also looked into this. I've tried both, but I don't quite understand the latter, and neither works...
import httplib

# Returns True if a connection for url can be opened, False otherwise.
def runInternet(url):
    try:
        page = httplib.HTTPConnection(url)
        page.connect()
    except httplib.HTTPException as e:
        return False
    return True
However, when I run it, every link is reported as broken! I picked a few at random, and they load fine when I enter them into my browser... so what's happening? I've narrowed the problem down to this line: page = httplib.HTTPConnection(url)
Edit: When I pass 'www.google.com' in place of url, the program works, and when I print e, it says nonnumeric port...
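
One possible reading of the "nonnumeric port" message, offered as a sketch rather than a confirmed diagnosis: HTTPConnection takes a host name (optionally host:port), not a full URL. If the entries in the .csv start with a scheme such as http://, the text after the colon is treated as a port, which would raise InvalidURL (a subclass of HTTPException) and make every URL come back False. The code below assumes that is the case and pulls the host out with urlsplit before connecting. split_host and run_internet are hypothetical names, and the 10-second timeout is an arbitrary choice.

```python
try:  # Python 2 module names, matching the code above
    import httplib
    from urlparse import urlsplit
except ImportError:  # Python 3 equivalents
    import http.client as httplib
    from urllib.parse import urlsplit


def split_host(url):
    """Return (host, port) for a URL; port is None when the URL has none."""
    # urlsplit only recognizes the host when a scheme or a leading '//' is
    # present, so a bare 'www.google.com' gets '//' prepended first.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    parts = urlsplit(url)
    return parts.hostname, parts.port


def run_internet(url, timeout=10):
    """Hypothetical variant of runInternet: True if a connection opens."""
    try:
        host, port = split_host(url)  # ValueError if the port is not a number
        if not host:
            return False
        # Plain HTTP only; port None falls back to the default port 80.
        conn = httplib.HTTPConnection(host, port, timeout=timeout)
        conn.connect()
        conn.close()
    except (httplib.HTTPException, IOError, ValueError):
        # IOError also covers socket errors such as DNS failures and timeouts.
        return False
    return True
```

Like the original, this only checks that a TCP connection can be opened. It does not send a request or look at the status code, so a page that returns 404 would still count as working. Calling run_internet('http://www.google.com/') and run_internet('www.google.com') should both reach the same host under these assumptions.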