Here is a piece of code:

    while (int(currentMemoryUsage[2]) < 50 and int(current_cpu_usage)<80):
        self.event_list.append(Event())
        p = Process(target=self.classA.functionB,
                    args=(a, b, self.lock, self.event_list[counter]))
        print('create process')
        self.process_list.append(p)
        self.event_list[counter].clear()
        counter += 1
        print("about to start process")
        p.start()
        print("new process started")

After starting a number of processes, the loop gets stuck at p.start(): it successfully creates the new process but then hangs at p.start(). Does anyone know why this happens?

EDIT 1: At first I thought it was stuck at p.start() because "new process started" was not printed and I could see some zombie processes in the system. But after I added signal.alarm(180) before p.start() to force execution past the blocked line, I still do not see the lines after it being executed. The edited code is shown below.

    while True:
        if (int(current_memory_usage[2]) < 50 and int(current_cpu_usage)<80):
            try:
                self.event_list.append(Event())
                p = Process(target=self.classA.functionB,
                            args=(a, b, self.lock, self.event_list[counter]))
                print('create process')
                self.process_list.append(p)
                self.event_list[counter].clear()
                counter += 1
                print("about to start process")
                signal.alarm(180)
                p.start()
                print("new process started")
                signal.alarm(0)
            except Exception as e:
                print("catch an exception" + str(e))
        print(" I am just a line in the while")

In other words, I am not seeing "new process started" (although the child process did start), nor "catch an exception", nor " I am just a line in the while". I guess the main process might be exiting somehow, but I am not getting any error explaining why it would quit. Does anyone have an idea?
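One thing worth checking about the signal.alarm(180) attempt: on Unix the default action for SIGALRM is to terminate the process, so the alarm can only surface as a catchable exception if a handler is installed first. Below is a minimal sketch of that idea, assuming a Unix-like system and code running in the main thread. The names `StartTimeout`, `_on_alarm` and `call_with_alarm` are hypothetical, and `time.sleep` merely stands in for a blocking call such as p.start().

```python
import signal
import time


class StartTimeout(Exception):
    """Hypothetical exception raised when the alarm fires."""


def _on_alarm(signum, frame):
    # Runs in the main thread when SIGALRM arrives; raising here
    # interrupts whatever blocking call is in progress.
    raise StartTimeout("timed out")


def call_with_alarm(func, seconds):
    """Run func(), raising StartTimeout if it takes longer than `seconds`."""
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.alarm(seconds)
    try:
        return func()
    finally:
        signal.alarm(0)  # cancel any pending alarm
        signal.signal(signal.SIGALRM, old_handler)  # restore previous handler


if __name__ == "__main__":
    try:
        # time.sleep stands in for a call that blocks too long.
        call_with_alarm(lambda: time.sleep(3), 1)
    except StartTimeout as e:
        print("catch an exception: " + str(e))
```

With a handler like this in place, the `except Exception` branch in the loop above would at least get a chance to run if the alarm fires.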

1a1a11a
  • Ignore the variable naming problem above; I forgot to change it. – 1a1a11a Mar 13 '16 at 18:32
  • I think more information is needed before this can be answered. Are you updating `currentMemoryUsage` and `current_cpu_usage` after you create each process? How many processes are created before it fails? Are you on a Unix-like system or Windows? Have you tried using a `multiprocessing.Pool` object instead of creating each process directly? – bnaecker Mar 13 '16 at 19:14
  • Thanks @bnaecker. I update the usage info in the loop; I forgot to include it. I am running it on Ubuntu 14.04 LTS and I tested on 5 machines; each failed at a different number of processes, some at 30, some at 80. I haven't tried `multiprocessing.Pool`; let me give it a try. – 1a1a11a Mar 13 '16 at 19:47
  • Also, inside the loop I check whether each process is alive and join it there; that part is not listed here. – 1a1a11a Mar 13 '16 at 19:50
  • @bnaecker, I read the documentation; it seems `multiprocessing.Pool` is not easy to use here, because it only accepts certain functions. – 1a1a11a Mar 13 '16 at 19:54
  • And I say it is stuck there rather than exited because 1. I didn't see any error message about an exit, and 2. all the finished child processes become zombies instead of orphans. – 1a1a11a Mar 13 '16 at 19:59
  • [This simplified edit of your code](http://pastebin.com/qJ5GDhgw) ran perfectly on an Ubuntu 14.04 LTS machine. Maybe useless, maybe not. Just wanted to play around with multiprocessing and saw your question :) – jDo Mar 13 '16 at 20:54
  • @jDo Thanks for helping. I also have a similar simplified version for testing the multiprocessing part, and it also works fine. But if I put my whole code (which itself runs fine) into it, it causes trouble. The whole code is too much to read, and I don't know which parts can be removed before pasting it here. – 1a1a11a Mar 13 '16 at 21:43
  • @1a1a11a OK, sounds tricky... If it's not private/restricted by a NDA or something, you could just use pastebin – jDo Mar 13 '16 at 21:53
  • @jDo Hi, here is the code: http://pastebin.com/73h4pyAj. The method being called in each process is `buildWebsiteTreeMultiThread` under `websiteTreeBuilder`. I deleted some of the code to make it clearer; I hope this won't cause any trouble. – 1a1a11a Mar 14 '16 at 00:38
  • I am not quite sure whether it is stuck there or has just quit, because I am seeing the message from the child process (which `p.start()` starts). I also added `signal.alarm` before `p.start()` to force execution past the blocked line, but the code after that still does not get a chance to execute. So the main process might be exiting, but why would it exit? – 1a1a11a Mar 14 '16 at 03:00
  • A good rule of thumb: refactor your code so that you can map the process function over an argument list. Then first iterate through a plain for loop to make sure your process function and main script work, without the complexity of parallelism. Then simply change your loop to `pool.map`, or to a for loop of `p.start(); plst.append(p)` followed by `[p.join() for p in plst]`. – Patrick the Cat Mar 14 '16 at 03:29
  • I just noticed: I hope you `join` the child processes instead of letting the parent die. – Patrick the Cat Mar 14 '16 at 03:32
  • `buildWebsiteTreeMultiThread` should be `buildWebsiteTreeMultiProcess`. Multiprocessing and multithreading have profound differences. – Patrick the Cat Mar 14 '16 at 03:35
  • @Mai Thank you for your kind advice! 1. I do need to refactor my code; it is almost a mess now. 2. I am trying to do load balancing: fetch a URL from the server when the current worker machine has enough memory and CPU power, so I cannot pre-map URLs to processes. 3. I do join all the children after the while loop. 4. I think `buildWebsiteTreeMultiThread` should be "thread": basically, I have a main process that keeps fetching URLs from the server and starts a new process to deal with each fetched URL, and in each started process I use multithreading to do the crawling. Maybe I am wrong? Please help me. Thank you! – 1a1a11a Mar 14 '16 at 04:16
  • Sorry, I didn't look at your code, so I was wrong about your naming. If you are doing threading, that's fine. On the other hand, I would suggest getting a sequential version of your code to work first, then making it run on multiple processes, then adding more threads to each process. For now it's hard to say what's wrong. For future development, please remember to push out minimal working increments. It's good practice in general. – Patrick the Cat Mar 14 '16 at 04:25
  • Thank you for your advice. When running as a single process, it works fine, and even with multiple processes the problem does not happen at first. It happens at some irregular moment with no pattern; I am totally lost about why this happens. – 1a1a11a Mar 14 '16 at 04:41
  • Thanks for the paste. I completely understand if you're not keen on sharing your entire project with the internet but there's some vital stuff missing. An `__init__` method, classes, all the imports, etc. I also see some threading in the script; mixing multiprocessing and threading adds to the complexity. Anyway, it won't run and I suspect only you, the author, can make sense of it without prolonged debugging. [Here's a question](https://stackoverflow.com/questions/13535680/python-debug-tools-for-multiprocessing) dealing with the debugging of multiprocessing code that might come in handy. – jDo Mar 14 '16 at 11:38
  • Sorry, I didn't mean that. In order to run it, you would still need 8 other files and several other packages; that's a bunch of code to go through. I deleted a lot of it only because I was hoping that would make it clearer, not because I don't want to share. @jDo – 1a1a11a Mar 14 '16 at 13:49
  • @jDo Here is a new simplified multiprocessing example that acts weird: http://stackoverflow.com/questions/35989545/multiprocessing-in-python-jump-over-process-with-no-error It is runnable; can you help? – 1a1a11a Mar 14 '16 at 14:00
  • @1a1a11a Sounds like you solved it? :) – jDo Mar 14 '16 at 17:35
  • I solved the simplified version, but not this one; I am still running tests on it. I will post any findings I have here. Thanks! @jDo – 1a1a11a Mar 14 '16 at 19:22
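The start-then-join pattern suggested in the comments can be sketched as follows. This is only a minimal illustration, not the asker's actual code: `worker` is a hypothetical stand-in for `self.classA.functionB`, `run_workers` is an invented helper, and the "fork" start method is assumed because the thread mentions Ubuntu. Joining each child after it finishes is also what reaps it, so finished children do not linger as zombies.

```python
import multiprocessing as mp

# Assumption: a Unix-like system, so the "fork" start method is available.
ctx = mp.get_context("fork")


def worker(n, lock, event, results):
    """Hypothetical stand-in for self.classA.functionB."""
    with lock:
        results.put(n * n)
    event.set()  # tell the parent this worker has finished its job


def run_workers(count):
    """Start `count` workers, collect their results, then join them all."""
    lock = ctx.Lock()
    results = ctx.Queue()
    process_list, event_list = [], []
    for n in range(count):
        event = ctx.Event()  # a new Event starts out cleared
        p = ctx.Process(target=worker, args=(n, lock, event, results))
        event_list.append(event)
        process_list.append(p)
        p.start()
    # Drain the queue before joining, so no child blocks on a full pipe.
    values = sorted(results.get(timeout=30) for _ in range(count))
    for p in process_list:
        p.join()  # reaps the child, so it does not remain a zombie
    return values, [p.exitcode for p in process_list]


if __name__ == "__main__":
    values, exit_codes = run_workers(4)
    print(values, exit_codes)
```

Getting a fixed-size version like this to behave first, and only then replacing the `for` loop with the load-dependent `while` loop, follows the incremental approach recommended in the comments.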

0 Answers