I have a Python script with more or less the following code:

import multiprocessing as mp

def some_function():
  pass

class SomeClass:
  def __init__(self):
    self.pool = mp.Pool(10)
  def do_smth(self):
    self.pool.map(some_function, range(10))


if __name__ == '__main__':
  cls = SomeClass()
  for _ in range(1000):
    print("*")
    cls.do_smth()

The real jobs are obviously much heavier than this. At some point, however, the script just gets stuck: no error is reported, the terminal shows that the script is still running, but no more "*" are printed, and the CPU monitor on my machine reports 3% CPU usage. It seems to have "crashed" silently.

For the moment, I think it might be a memory issue (although RAM usage stays at 70% while it is working), but I have no idea... do you have any idea?

I'm working on a MacBook Pro M1 Max with 24 GPU cores and 32 GB of RAM.

Alberto Sinigaglia
  • What python version? – tzaman Dec 23 '22 at 17:17
  • You'll need to provide code that reproduces the problem for us. This code is not runnable. If we change *some_function()* to *some_function(n)* then this runs without error in Python 3.11.1 on Xeon. Of course that doesn't mean that there's not a problem in your out-of-date version. Can't see why it could be M1-specific – DarkKnight Dec 23 '22 at 17:37
  • @Fred Fair point. If you want, I can point to the repository (you will need to install tensorflow to run the code), but it's a pretty basic script – Alberto Sinigaglia Dec 23 '22 at 18:25
  • seems pretty close to https://stackoverflow.com/questions/65115092/occasional-deadlock-in-multiprocessing-pool – Alberto Sinigaglia Dec 23 '22 at 18:47

1 Answer

You may need to change the start method used for new processes:

mp.set_start_method('spawn')

This has been the default on macOS since Python 3.8, but not before; it was changed because of apparent crashes when using fork, as reported in this issue. Apart from that, if there is any thread-level locking happening, it can lead to deadlocks in fork mode.
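As a rough sketch of how this could look in the question's script: the pool below is created from an explicit `spawn` context rather than the platform default. The `workers` and `timeout` parameters, the `close` method, and the `n * n` body are my own additions for illustration and are not in the original code.

```python
import multiprocessing as mp


def some_function(n):
    # Placeholder for the real (much heavier) job. It takes one argument
    # because map() passes each item of the iterable to it.
    return n * n


class SomeClass:
    def __init__(self, workers=4):
        # Ask for the start method explicitly through a context instead of
        # relying on whatever the platform default happens to be.
        ctx = mp.get_context("spawn")
        self.pool = ctx.Pool(workers)

    def do_smth(self, timeout=60):
        # map_async(...).get(timeout=...) raises multiprocessing.TimeoutError
        # if no result arrives in time, instead of blocking forever.
        return self.pool.map_async(some_function, range(10)).get(timeout=timeout)

    def close(self):
        # Stop accepting work and wait for the worker processes to exit.
        self.pool.close()
        self.pool.join()


if __name__ == "__main__":
    cls = SomeClass()
    try:
        for _ in range(3):
            print("*", cls.do_smth())
    finally:
        cls.close()
```

The timeout does not fix a deadlock; it only turns a silent hang into an exception, which should at least tell you which iteration stalled.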

tzaman
  • `python --version -> Python 3.9.15`... and with `python -i` I see this: `multiprocessing.get_start_method() #'spawn'`... I'll still add it to be more explicit and see if it makes any difference – Alberto Sinigaglia Dec 23 '22 at 17:29
  • Absent that, there's no way to magically debug why your code is getting stuck without a reproducible example. – tzaman Dec 23 '22 at 17:48
  • Fair point, If you want I can point to the repository (you will need to install tensorflow to run the code), but it's a pretty basic script – Alberto Sinigaglia Dec 23 '22 at 18:25