1

I have a small script that filters for domain names that are not registered yet. I use the pythonwhois module. The problem is that it suddenly freezes and does nothing after some number of requests (sometimes hundreds). I don't think it is a ban, because I can rerun the program right after the freeze and it works.

I would like to avoid this freezing. My idea is to measure the runtime of the lookup and, if it exceeds some limit (for example, 10 seconds), retry it.
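
The retry-on-timeout idea could be sketched roughly as follows, using only the standard library. The helper name `call_with_timeout` and its `timeout` and `retries` parameters are made up for illustration. Note that a thread that hangs cannot be killed from Python, so a timed-out attempt is simply abandoned in a daemon thread.

```python
import threading

def call_with_timeout(func, args=(), timeout=10.0, retries=2):
    """Call func(*args); give up on an attempt after `timeout` seconds and retry.

    Hypothetical helper for illustration, not part of pythonwhois.
    """
    def worker(out):
        # Store either the return value or the exception for the caller.
        try:
            out['value'] = func(*args)
        except Exception as exc:
            out['error'] = exc

    for _ in range(retries + 1):
        out = {}  # fresh container per attempt, so a stale thread cannot overwrite it
        thread = threading.Thread(target=worker, args=(out,))
        thread.daemon = True  # an abandoned attempt must not block interpreter exit
        thread.start()
        thread.join(timeout)
        if thread.is_alive():
            continue  # attempt timed out; leave it running and try again
        if 'error' in out:
            raise out['error']
        return out['value']
    raise TimeoutError('no result after %d attempts' % (retries + 1))
```

With something like this, the lookup would become `call_with_timeout(pythonwhois.get_whois, args=(keyword + '.com',))`, and catching `TimeoutError` in the loop would allow skipping to the next keyword.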

Do you have any advice on how to avoid the freezing? Or is there a better way to check domains?

Here is the code:

import pythonwhois

for keyword in keywords:
    try:
        details = pythonwhois.get_whois(keyword + '.com')
    except Exception as e:
        print(e)
        continue
    # Treat a response without a 'status' field as an unregistered domain
    if 'status' not in details:
        print('Free domain!')
        print(keyword)
Milano
  • Maybe add time.sleep at the start of the for loop – Ajay May 24 '15 at 18:17
  • @Ajay The issue is probably not load, but registered domain names whose servers are improperly responding. – JDong May 24 '15 at 18:28
  • @JDong And do you have any idea how to skip to the next keyword? Thanks – Milano May 24 '15 at 18:40
  • This is the sort of problem where threading can be useful, as you can run your whois lookup asynchronously without blocking the rest of your program. – mpursuit May 24 '15 at 19:04
  • @mpursuit The whole program is only the lookup, so what do you mean? Should I create a new thread for each keyword? Thanks – Milano May 24 '15 at 19:08
  • Maybe not threads but something async, like https://github.com/tkudla/tornado-whois – kwarunek May 24 '15 at 19:34
  • Bulk querying whois servers will get you rate limited, tarpitted, blackholed and/or blacklisted. This would explain the freeze. You should make sure to read the TOS of the service you use, and apply a local delay so that you do not make too many requests. – Patrick Mevzek Jan 02 '18 at 19:55

2 Answers

4

This approach may break if the underlying library changes, but you can call the socket module through pythonwhois to set a default timeout for all of its network calls. For example:

import pythonwhois

TIMEOUT = 5.0  # timeout in seconds
# Sets the default timeout for sockets created after this call
pythonwhois.net.socket.setdefaulttimeout(TIMEOUT)
pythonwhois.get_whois("example.com")
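
To illustrate what a socket timeout does for a stalled server, here is a minimal standard-library sketch. It does not use pythonwhois at all. The function name `fetch_with_timeout` and its parameters are hypothetical, and it assumes a plain request/response protocol over TCP, which is how WHOIS works on port 43.

```python
import socket

def fetch_with_timeout(host, port, payload, timeout=5.0):
    """Send payload and read the full reply; return None if the peer stalls.

    Hypothetical helper for illustration only.
    """
    try:
        # The timeout applies to connecting and to each send/recv call.
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(payload)
            chunks = []
            while True:
                data = conn.recv(4096)
                if not data:  # the server closed the connection: reply complete
                    break
                chunks.append(data)
            return b"".join(chunks)
    except socket.timeout:
        # The server accepted the connection but stopped responding.
        return None
```

A caller can treat a `None` result as "skip this domain and move on" instead of blocking forever. Other network errors, such as a refused connection, still propagate as exceptions.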
amlweems
0

Maybe you could try dnspython. It looks like you just want to check if a domain name is registered. For example:

import dns.resolver

for keyword in keywords:
    try:
        dns.resolver.query(keyword+'.com')
    except dns.resolver.NXDOMAIN:
        print(keyword+'.com is available!')

The DNS resolver has a default timeout of 2 seconds. If you want to change that, you can create a new instance of dns.resolver.Resolver and set a different timeout on it.

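To run the lookups in parallel, a worker pool would be the best choice if you can use Python 3. Note that multiprocessing.Pool, used here, is a pool of processes; multiprocessing.pool.ThreadPool offers the same interface with threads: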

import dns.resolver
from multiprocessing import Pool

def check_keyword(keyword):
    """Return the domain if DNS reports it as nonexistent, else None."""
    try:
        dns.resolver.query(keyword+'.com')
    except dns.resolver.NXDOMAIN:
        return keyword+'.com'
    return None

if __name__ == '__main__':
    keywords = [...]
    p = Pool(5)
    # Keep only the domains that came back as available
    print([d for d in p.map(check_keyword, keywords) if d])
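
The pool approach above can also be sketched with threads from the standard library, which suits network-bound lookups. In this sketch the function `find_available` and its `is_registered` callback are hypothetical names. The callback stands in for whatever lookup you trust (whois or DNS) and is assumed to return True for a registered domain.

```python
from concurrent.futures import ThreadPoolExecutor

def find_available(keywords, is_registered, workers=5):
    """Return the '.com' domains for which is_registered(domain) is false.

    Hypothetical helper; is_registered is supplied by the caller.
    """
    domains = [keyword + ".com" for keyword in keywords]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves input order, so flags line up with domains.
        flags = list(pool.map(is_registered, domains))
    return [domain for domain, taken in zip(domains, flags) if not taken]
```

Any exception raised by the callback propagates when the results are collected, so a callback that should never abort the whole batch needs to catch its own errors.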
JDong
  • Thank you for your response, but this does not work properly. For example, 321000.com is reported as not registered yet, but it is. – Milano May 24 '15 at 20:35
  • Hmm, looks like you must use `whois`. I will look for a whois Python library with timeouts. – JDong May 24 '15 at 20:42