
I am using Python's built-in socket and multiprocessing libraries to scan TCP ports of a host. I know my first function works; I am just trying to make it work with the multiprocessing Queue and Process, and I'm not sure where I am going wrong.

If I remove the Queue everything seems to complete; I just need to actually get the results from it.

from multiprocessing import Process, Queue
import socket

def tcp_connect(ip, port_number):
    try:
        scanner = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        scanner.settimeout(0.1)
        scanner.connect((str(ip), port_number))
        scanner.close()

        #put into queue
        q.put(port_number)
    except:
        pass

RESULTS = []
list_of_numbs = list(range(1,501))

for numb in list_of_numbs:

    #make my queue
    q = Queue()
    p = Process(target=tcp_connect, args=('google',numb))
    p.start()
    #take my results from my queue and append to a list
    RESULTS.append(q.get())
    p.join()

print(RESULTS)

I would just like it to print out port numbers that were open. Right now since it is scanning google.com it should really only return 80 and 443.

EDIT: This would work if I used Pool, but the reason I went to Process and Queue is that the bigger piece of this runs in Django with Celery, and they don't allow daemonic processes when executing with Pool.

Mellivice
  • Why are you creating a new queue for every subprocess? Why not use a single queue? – Tom Dalton Jul 26 '19 at 13:40
  • Also, you aren't passing the queue into your subprocess's function. – Tom Dalton Jul 26 '19 at 13:40
  • To be clear, what OS are you on? I'm assuming not Windows (pretty sure this code would die in short order on Windows), but you need to be explicit with `multiprocessing` questions, where the behavior differs dramatically. Knowing if you're on Python 2 or 3 (and which minor version) may also be useful. – ShadowRanger Jul 26 '19 at 13:44
  • Also important: [Why is “except: pass” a bad programming practice?](https://stackoverflow.com/q/21553327/364696). If you're having problems of any kind in your `tcp_connect` function, you'll *never* see *any* evidence of them except for the queue not being populated. – ShadowRanger Jul 26 '19 at 13:45
  • Sorry yes I can move the queue out of the for loop. OS is Ubuntu Python 3.7 – Mellivice Jul 26 '19 at 13:47
  • Off-topic advice: you shouldn't scan Google or similar services, since they may blacklist your IP for a while. You might see something like an ["Unusual traffic from your computer network" captcha](https://img.devrant.com/devrant/rant/r_1609925_oaCoi.jpg) every time you visit Google in your browser, which can be kinda annoying. – asikorski Jul 26 '19 at 13:52
  • You should also add an `if __name__ == '__main__':` guard, as advised in the "Safe importing of main module" section of the [documentation](https://docs.python.org/3/library/multiprocessing.html#the-spawn-and-forkserver-start-methods). – martineau Jul 26 '19 at 14:06
  • @martineau: Yeah, the lack of that guard is what made me fairly sure this was a non-Windows machine. :-) – ShadowRanger Jul 26 '19 at 15:24
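Pulling the comments' suggestions together (a single shared queue, passed explicitly into the worker, created under an `if __name__ == '__main__':` guard), a minimal sketch of a working Process/Queue version might look like the following. The target host and port range are substituted with localhost and a smaller range purely for illustration; the question scanned google.com on ports 1–500.

```python
from multiprocessing import Process, Queue
import socket

def tcp_connect(q, ip, port_number):
    # The queue is now an explicit parameter, so each worker
    # can report its result back to the parent process.
    try:
        scanner = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        scanner.settimeout(0.1)
        scanner.connect((str(ip), port_number))
        scanner.close()
        q.put(port_number)  # only reached if the connect succeeded
    except OSError:
        pass  # closed or filtered port: report nothing

if __name__ == '__main__':
    q = Queue()  # one shared queue for all workers
    processes = []
    # Small range for illustration; the question used range(1, 501).
    for numb in range(1, 101):
        p = Process(target=tcp_connect, args=(q, 'localhost', numb))
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
    # Safe to drain after joining here: all producers have exited, and
    # the results (a few small integers) fit in the queue's buffer.
    RESULTS = []
    while not q.empty():
        RESULTS.append(q.get())
    print(sorted(RESULTS))
```

Note that spawning one process per port is still expensive; a fixed pool of workers pulling ports from a task queue would scale better.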

1 Answer


For work like this, a multiprocessing.Pool would be a better way of handling it.

You don't have to worry about creating Processes and Queues; all that is done for you in the background. Your worker function only has to return a result, and that will be transported to the parent process for you.

I would suggest using multiprocessing.Pool.imap_unordered(), because it starts returning results as soon as they are available.

One thing: the worker function takes only one argument. If you need multiple different arguments for each call, wrap them in a tuple. If you have arguments that are the same for all calls, use functools.partial.
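As a sketch of that suggestion (not code from the question, and with a localhost target substituted for the question's google.com purely for illustration), functools.partial can pin the host argument while imap_unordered feeds one port number per call:

```python
from multiprocessing import Pool
from functools import partial
import socket

def tcp_connect(ip, port_number):
    """Return the port if it accepts a TCP connection, else None."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as scanner:
            scanner.settimeout(0.1)
            scanner.connect((ip, port_number))
        return port_number
    except OSError:
        return None

if __name__ == '__main__':
    # partial fixes the host; imap_unordered then supplies one port
    # per call and yields results in whatever order they finish.
    scan = partial(tcp_connect, 'localhost')  # the question used 'google.com'
    with Pool(50) as pool:
        open_ports = [p for p in pool.imap_unordered(scan, range(1, 501))
                      if p is not None]
    print(open_ports)
```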


A slightly more modern approach would be to use the Executor.map() method from concurrent.futures. Since your work consists mainly of socket calls, you could use a ThreadPoolExecutor here, I think; that should be slightly less resource-intensive than a ProcessPoolExecutor.
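A sketch of that alternative, assuming the same worker shape as above and, again, a localhost target for illustration:

```python
from concurrent.futures import ThreadPoolExecutor
import socket

def tcp_connect(ip, port_number):
    """Return the port if it accepts a TCP connection, else None."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as scanner:
            scanner.settimeout(0.1)
            scanner.connect((ip, port_number))
        return port_number
    except OSError:
        return None

# Threads are cheap here because the workers spend their time
# blocked on socket I/O rather than burning CPU.
with ThreadPoolExecutor(max_workers=100) as executor:
    # map() accepts one iterable per worker argument and zips them.
    results = executor.map(tcp_connect, ['localhost'] * 500, range(1, 501))
    open_ports = [port for port in results if port is not None]
print(open_ports)
```

Threads also sidestep the daemonic-process restriction mentioned in the question's EDIT, which may matter under Celery.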

Roland Smith
  • Yes, Pool does work much better for this; however, the reason I went to Process and Queue is that the bigger piece of this runs in Django with Celery, and they don't allow daemonic processes when executing with Pool – Mellivice Jul 26 '19 at 14:45
  • @Mellivice You should definitely add that information to your question... – Roland Smith Jul 26 '19 at 14:49
  • @Mellivice: Don't *allow* daemon? Or the daemon processes get terminated if the connection handler process ends? If the latter, the fix is make sure you've received all your results from the `Pool`, and `join` the `Pool` after `close`ing/`terminate`ing it to ensure it's fully cleaned up. – ShadowRanger Jul 26 '19 at 15:27