I'm trying to build a Python script that checks all of the links in a file.
It opens the file, reads the sites, and prints them, but it doesn't print their statuses, and I can't wrap my head around why. There are no syntax errors, all libraries are loaded, and the function is called.
Code:
import urllib.request
import urllib.error
import time
from multiprocessing import Pool

start = time.time()
file = open('list.txt', 'r', encoding="ISO-8859-1")
urls = file.readlines()
print(urls)

def checkurl(url):
    try:
        conn = urllib.request.urlopen(url)
    except urllib.error.HTTPError as e:
        # Return code error (e.g. 404, 501, ...)
        # ...
        print('HTTPError: {}'.format(e.code) + ', ' + url)
    except urllib.error.URLError as e:
        # Not an HTTP-specific error (e.g. connection refused)
        # ...
        print('URLError: {}'.format(e.reason) + ', ' + url)
    else:
        # 200
        # ...
        print('good' + ', ' + url)

if __name__ == "__main__":
    p = Pool(processes=20)
    result = p.map(checkurl, urls)
    print("done in : ", time.time()-start)
The output is:
Python 3.9.1 (tags/v3.9.1:1e5d33e, Dec 7 2020, 17:08:21) [MSC v.1927 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>>
===================== RESTART: C:\Users\Anthon\Desktop\b.py ====================
['http://google.com\n', 'http://yahoo.com\n', 'http://thissitedoeesntexistpapgojwpgoajwpogap.com']
done in : 1.954719066619873
>>>
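
A variant worth trying, sketched under the assumption (not confirmed for this setup) that the statuses are lost because the IDLE shell does not show what the pool's child processes print. Here each worker returns its status string and the parent process does the printing. The names check_url and main, the 10-second timeout, the stripping of the trailing newline, and the handling of a missing list.txt are choices made for this sketch, not part of the original script.

```python
import urllib.request
import urllib.error
from multiprocessing import Pool


def check_url(url):
    """Return a one-line status string for url instead of printing it."""
    url = url.strip()  # drop the trailing newline left by reading lines
    try:
        # timeout=10 is an arbitrary value chosen for this sketch
        with urllib.request.urlopen(url, timeout=10):
            pass
    except urllib.error.HTTPError as e:
        # Return code error (e.g. 404, 501, ...)
        return 'HTTPError: {}, {}'.format(e.code, url)
    except urllib.error.URLError as e:
        # Not an HTTP-specific error (e.g. connection refused)
        return 'URLError: {}, {}'.format(e.reason, url)
    return 'good, ' + url


def main(path='list.txt'):
    try:
        with open(path, 'r', encoding='ISO-8859-1') as f:
            urls = [line for line in f if line.strip()]  # skip blank lines
    except FileNotFoundError:
        print('no such file: ' + path)
        return
    with Pool(processes=20) as pool:
        results = pool.map(check_url, urls)
    # Printing happens in the parent process, so it does not depend on
    # where the child processes' stdout ends up.
    for line in results:
        print(line)


if __name__ == '__main__':
    main()
```

Running the original script from a regular command prompt (python b.py) instead of from IDLE would be another way to check whether the assumption holds.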