CODE :
def ValidateProxy(LIST_PROXIES):
    '''
    Checks if scraped proxies allow HTTPS connection
    '''
    for proxy in LIST_PROXIES:
        print('using', proxy)
        host, port = str(proxy).split(":")
        try:
            resp = requests.get('https://amazon.com',
                                proxies=dict(https=f'socks5://{host}:{port}'),
                                timeout=6)
        except ConnectionError:
            print(proxy, 'REMOVED')
            LIST_PROXIES.remove(proxy)
    print(len(LIST_PROXIES), 'PROXIES GATHERED')
    if len(LIST_PROXIES) != 0:
        return LIST_PROXIES
    else:
        return None
INPUT :
['46.4.96.137:1080', '138.197.157.32:1080', '138.68.240.218:1080'.....] #15 proxies
OUTPUT :
using 46.4.96.137:1080
46.4.96.137:1080 REMOVED
using 138.68.240.218:1080
138.68.240.218:1080 REMOVED
using 207.154.231.213:1080
207.154.231.213:1080 REMOVED
using 198.199.120.102:1080
198.199.120.102:1080 REMOVED
using 88.198.24.108:1080
88.198.24.108:1080 REMOVED
using 188.226.141.211:1080
188.226.141.211:1080 REMOVED
using 92.222.180.156:1080
92.222.180.156:1080 REMOVED
using 183.233.183.70:1081
183.233.183.70:1081 REMOVED
7 PROXIES GATHERED # len(LIST_PROXIES) == 7, so 8 are removed which are printed above
MY DOUBTS :
Why is print('using', proxy) not executed for every proxy? The input list has 15 items, but this line is printed only 8 times.
Are the try and except blocks both executed every time? I ask because REMOVED is printed on the console for every proxy that is tried.
I want the function to run print('using', proxy) for every proxy and, if a ConnectionError is raised, to run print(proxy, 'REMOVED') and remove that proxy from the list.
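A likely explanation for the 8 lines of output: a for loop walks the list by position, so calling LIST_PROXIES.remove(proxy) inside the loop shifts the remaining items left and the next one is skipped. If all 15 proxies fail, only every other proxy is visited (8 of them) and 7 are left in the list untested. The sketch below reproduces this without any network access and shows one way to get the wanted behaviour by building a new list instead of mutating the one being iterated. The names remove_while_iterating, validate_proxies and is_alive are hypothetical; is_alive is a stand-in for the requests.get call wrapped in try/except.

```python
def remove_while_iterating(items):
    """Reproduce the problem: remove every visited item from the list being looped over."""
    visited = []
    for item in items:
        visited.append(item)
        items.remove(item)  # shifts later items left, so the next one is skipped
    return visited


def validate_proxies(proxies, is_alive):
    """Return a new list of the proxies for which is_alive(proxy) is true.

    is_alive is a hypothetical callable standing in for the real HTTPS check.
    """
    working = []
    for proxy in proxies:  # the input list is never modified during the loop
        print('using', proxy)
        if is_alive(proxy):
            working.append(proxy)
        else:
            print(proxy, 'REMOVED')
    print(len(working), 'PROXIES GATHERED')
    return working


if __name__ == '__main__':
    numbers = list(range(15))
    seen = remove_while_iterating(numbers)
    print(len(seen), 'visited,', len(numbers), 'left')  # 8 visited, 7 left

    # Pretend that only the second proxy is dead.
    validate_proxies(['a:1080', 'b:1080', 'c:1080'], lambda p: p != 'b:1080')
```

On the second doubt: the except block runs only when the code inside try raises, so REMOVED appearing after every 'using' line simply means that every proxy that was actually tried failed. Iterating over a copy (for proxy in list(LIST_PROXIES)) is another option if the original list has to be modified in place.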
EDIT : FULL INPUT
['46.4.96.137:1080', '138.197.157.32:1080', '138.68.240.218:1080', '162.243.108.129:1080', '207.154.231.213:1080', '176.9.119.170:1080', '198.199.120.102:1080', '176.9.75.42:1080', '88.198.24.108:1080', '188.226.141.61:1080', '188.226.141.211:1080', '125.124.185.167:38801', '92.222.180.156:1080', '188.166.83.17:1080', '183.233.183.70:1081']