
I want to write a simple port scanner. I do it as follows:

import os
import socket
from multiprocessing.pool import ThreadPool
from threading import Lock


class Scanner(object):
    def __init__(self, addr=None):
        self._addr = addr if addr else '127.0.0.1'
        self._lock = Lock()
        self._opened_ports = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Never suppress exceptions raised inside the with-block.
        return False

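    def _scan_ports(self, thread_num):
        # Each worker scans its own block of 1000 ports:
        # worker 0 -> 0..999, worker 1 -> 1000..1999, and so on.
        from_port = 1000 * thread_num
        ports_count = 1000

        for port in range(from_port, from_port + ports_count):
            # Create the socket outside try so that close() in finally is always valid.
            sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            try:
                sock.connect((self._addr, port))
                with self._lock:
                    self._opened_ports.append(port)
            except OSError:
                # Closed, filtered, or unreachable port: skip it.
                pass
            finally:
                sock.close()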

    def scan(self, num_threads=None):
        if not num_threads:
            num_threads = len(os.sched_getaffinity(0))
        pool = ThreadPool(num_threads)

        thread_numbers = list(range(num_threads))

        pool.map(self._scan_ports, thread_numbers)
        pool.close()
        pool.join()

        return sorted(self._opened_ports)

As you can see, I'm trying to connect to ports in parallel. I run it in multithreaded mode like this:

with Scanner() as scanner:
    ports = scanner.scan()

On my Core i7 with 8 cores it takes about 200 ms. If I do the same on 1 thread like this:

with Scanner() as scanner:
    ports = scanner.scan(1)

it takes 100 ms.

I don't understand why I get this result. Why does scanning 1000 ports in 1 thread take half the time of scanning 1000 ports in 8 threads? Please explain.

UPD: if I replace the socket code with a simple time.sleep, it works as I expect: the multithreaded mode takes the same time as the single-threaded one. So what is wrong with sockets?

Cœur
borune
  • The multithreaded version is doing eight times as much work, in only twice the time. Why do you think this is a problem? – jasonharper Apr 09 '19 at 16:37
  • @jasonharper I think the times should be about the same. The multithreaded version should take a little more time due to context switching, but not twice as much. If I replace the socket code with time.sleep, the behavior is exactly what I'm expecting – borune Apr 09 '19 at 17:34
  • See https://stackoverflow.com/questions/1294382/what-is-the-global-interpreter-lock-gil-in-cpython – adrtam Apr 09 '19 at 20:48
  • @adrtam I have heard about the GIL, but it doesn't explain why replacing the socket routine with sleep changes the behaviour – borune Apr 10 '19 at 07:18
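
To make the two timings comparable, the total number of ports has to stay fixed while only the worker count changes. In the code from the question each worker scans its own block of 1000 ports, so 8 threads cover 8000 ports while 1 thread covers 1000. Below is a minimal sketch of a fixed-size scan, not the asker's code: `split_ports`, `scan_chunk`, and the `first_port`, `port_count`, and `timeout` parameters are hypothetical names introduced for illustration, and `ThreadPool` is assumed to come from `multiprocessing.pool`.

```python
import socket
from multiprocessing.pool import ThreadPool


def split_ports(first_port, port_count, num_workers):
    """Split [first_port, first_port + port_count) into num_workers contiguous ranges."""
    base, extra = divmod(port_count, num_workers)
    chunks = []
    start = first_port
    for i in range(num_workers):
        # The first `extra` workers take one additional port each.
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def scan_chunk(args):
    """Return the ports from one chunk that accept a TCP connection."""
    addr, ports, timeout = args
    opened = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success and an error code otherwise.
            if sock.connect_ex((addr, port)) == 0:
                opened.append(port)
    return opened


def scan(addr="127.0.0.1", first_port=0, port_count=1000, num_workers=1, timeout=0.5):
    """Scan the same port_count ports no matter how many workers are used."""
    chunks = split_ports(first_port, port_count, num_workers)
    with ThreadPool(num_workers) as pool:
        results = pool.map(scan_chunk, [(addr, chunk, timeout) for chunk in chunks])
    return sorted(port for chunk in results for port in chunk)
```

With this layout, `scan(num_workers=1)` and `scan(num_workers=8)` both cover ports 0 to 999, so any difference in run time comes from threading rather than from the amount of work.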

1 Answer


It seems to be a localhost issue. If I scan any other IP address, not localhost, it works better.
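
One way to check this is to time a single connection attempt. On localhost a closed port is typically refused almost immediately, so there is very little waiting for threads to overlap, whereas a remote host adds at least a network round trip per attempt. A rough sketch, where `time_connect` and its `timeout` parameter are hypothetical names used only for illustration:

```python
import socket
import time


def time_connect(addr, port, timeout=1.0):
    """Try one TCP connect and return (is_open, elapsed_seconds)."""
    started = time.perf_counter()
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        is_open = sock.connect_ex((addr, port)) == 0
    return is_open, time.perf_counter() - started
```

Calling it for a handful of ports on 127.0.0.1 and on another host should show how much of each attempt is spent waiting on the network, which is the part that threads can actually run in parallel.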

borune