
I'm trying to write a program that checks which proxies are active. When my script tries to connect to an inactive proxy, it can take up to about 30 seconds. When I check a list of thousands of proxies, that adds a few hours to the script's running time.

Is it possible to abort this function if it takes more than 5 seconds to respond?

import requests

def get(url, proxy):
    proxies = {
        'http': 'http://' + proxy,
        'https': 'https://' + proxy
    }
    s = requests.Session()
    s.headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}
    s.proxies = proxies
    r = s.get(url)
    return [r.status_code, r.reason, r.text]

with open('proxy.txt') as ips:
    for ip in ips:
        ip = ip.split('\n', 1)[0]
        try:
            get(url, ip)
            with open('working.txt', 'a') as the_file:
                the_file.write(ip + '\n')
        except:
            print("error")

Thank you.


1 Answer


Use the timeout keyword argument with s.get: s.get(url, timeout=5). If the proxy doesn't respond within 5 seconds, requests raises an exception, which your existing except block will catch, so the slow proxy simply won't be written to working.txt.
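For illustration, here is a minimal sketch of how the timeout could be wired into the get function from the question, with the timeout caught explicitly instead of via a bare except. The 5-second value, the example URL, and the example proxy address are just placeholder choices:

import requests

def get(url, proxy, timeout=5):
    # Route both http and https traffic through the given proxy.
    proxies = {
        'http': 'http://' + proxy,
        'https': 'https://' + proxy
    }
    s = requests.Session()
    s.proxies = proxies
    # If connecting or reading takes longer than `timeout` seconds,
    # requests raises an exception instead of hanging.
    r = s.get(url, timeout=timeout)
    return [r.status_code, r.reason, r.text]

try:
    get('http://example.com', '1.2.3.4:8080')  # hypothetical URL and proxy
except requests.exceptions.Timeout:
    print("proxy too slow, skipping")
except requests.exceptions.RequestException:
    print("other connection error")

Note that the timeout applies separately to the connect and read phases, so it is not a hard cap on the total request time, but it is enough to skip proxies that never answer.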
