10

I'm trying to debug a simple Python application, but no luck so far.

import multiprocessing

def worker(num):
    for a in range(0, 10):
        print(a)

if __name__ == '__main__':
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        p.start()

I want to set a breakpoint inside the for-loop to track the values of 'a', but none of the tools I tried are able to do that. So far I have tried debugging with:

  • PyCharm, which gives the following error: ImportError: No module named pydevd - http://youtrack.jetbrains.com/issue/PY-6649 It looks like they are still working on a fix for this and, from what I understand, there is no ETA.
  • I also tried debugging with Winpdb - http://winpdb.org - but it simply won't step inside my 'worker' method and print the values of 'a'.

I would really appreciate any help with this!

Mikael Engver
barmaley
  • 3
    When it comes to multiprocessing/multithreading, there's no such thing as "simple". In my opinion, at least. – JAB Jun 19 '12 at 18:31
  • 1
    That is a Windows-specific bug in the PyCharm debugger. If you really need to debug an application using the multiprocessing module, I can recommend using a Unix virtual machine and setting up a remote interpreter to that VM from your PyCharm. – Dmitry Trofimov Jun 25 '12 at 17:53

7 Answers

7

I found it very useful to replace multiprocessing.Process() with threading.Thread() when I'm going to set breakpoints. Both classes take similar arguments, so in most cases they are interchangeable.

Usually my scripts use Process() unless I pass the command-line argument --debug, which effectively replaces those calls with Thread(). That lets me debug those scripts with pdb.
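For illustration, here is a minimal sketch of that pattern applied to the code in the question. The `--debug` flag name and the `Runner` alias are just conventions for this sketch, not part of any library:

```python
import argparse
import multiprocessing
import threading

def worker(num):
    # Same worker as in the question, with Python 3 print syntax
    for a in range(10):
        print(num, a)

if __name__ == '__main__':
    # Hypothetical --debug flag: run workers as threads instead of
    # processes, so pdb breakpoints set inside worker() are hit.
    parser = argparse.ArgumentParser()
    parser.add_argument('--debug', action='store_true')
    args, _ = parser.parse_known_args()
    Runner = threading.Thread if args.debug else multiprocessing.Process
    jobs = [Runner(target=worker, args=(i,)) for i in range(5)]
    for job in jobs:
        job.start()
    for job in jobs:
        job.join()
```

Because Thread and Process share the target/args/start/join interface, the rest of the script does not need to change.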

Maksym Ganenko
6

You should be able to do it with remote-pdb.

from multiprocessing import Pool

def test(thing):
    # Opens a debugging session that listens on a TCP port
    from remote_pdb import set_trace
    set_trace()
    s = thing * 2
    print(s)
    return s

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(test, ['dog', 'cat', 'bird']))

Then just telnet to the port that's listed in the log.

Example:

RemotePdb session open at 127.0.0.1:54273, waiting for connection ...
telnet 127.0.0.1 54273
<telnet junk>
-> s = thing*2
(Pdb) 

or

nc -tC 127.0.0.1 54273

-> s = thing * 2
(Pdb)

You should be able to debug the process at that point.

Oreo
OnionKnight
2

I copied everything in /Applications/PyCharm\ 2.6\ EAP.app/helpers/pydev/*.py to site-packages in my virtualenv and it worked for me (I'm debugging celery/kombu; breakpoints work as expected).

cerberos
2

It would be great if regular pdb/ipdb worked with multiprocessing. When I can get away with it, I handle the calls serially if the number of configured processes is 1.

if processes == 1:
    for record in data:
        worker_function(record)
else:
    pool.map(worker_function, data)

Then when debugging, configure the application to only use a single process. This doesn't cover all cases, especially when dealing with concurrency issues, but it might help.
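A self-contained sketch of that idea, where `worker_function` and `run` are hypothetical stand-ins for whatever your application actually does:

```python
import multiprocessing

def worker_function(record):
    # Hypothetical worker: just squares its input
    return record * record

def run(data, processes=1):
    # Serial path: plain function calls, so pdb can step into
    # worker_function and inspect its locals
    if processes == 1:
        return [worker_function(record) for record in data]
    # Parallel path: the usual pool, once the code is debugged
    with multiprocessing.Pool(processes) as pool:
        return pool.map(worker_function, data)

if __name__ == '__main__':
    print(run([1, 2, 3], processes=1))  # -> [1, 4, 9]
```

Both paths produce the same results, so the `processes` setting can live in your normal configuration and only needs to be flipped to 1 for a debugging session.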

Joseph Sheedy
1

I've rarely needed to use a traditional debugger when attempting to debug Python code, preferring instead to liberally sprinkle my code with trace statements. I'd change your code to the following:

import multiprocessing
import logging

def worker(num):
    for a in range(0, 10):
        logging.debug("(%d, %d)", num, a)

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(i,))
        logging.info("Starting process %d", i)
        p.start()

In production, you disable the debug trace statements by setting the log level to logging.WARNING, so that only warnings and errors are logged.

There's a good basic and advanced logging tutorial on the official Python site.

CadentOrange
  • 1
    I truly appreciate your quick response but unfortunately this is not what I'm looking for. What I should have mentioned in my question, is that this is just a simple example of the functionality that I need but in reality I'm dealing with complex objects and I need breakpoints to see their content at a certain time. Simple printing the content would not be enough. – barmaley Jun 19 '12 at 18:21
  • I don't know of any python debugger that is as capable as the Visual Studio debugger. Perhaps you could modify the logging technique to only log under certain conditions, effectively simulating a conditional breakpoint? – CadentOrange Jun 19 '12 at 18:55
  • 1
    logging is great, but it isn't free even when the logging level is set above the logging statement level since function calls have a significant cost in a dynamic language like Python. It's good to know a debugger, and pdb ships with Python. Then it's just a problem of setting your multiprocessing project up correctly. – Joseph Sheedy Dec 14 '15 at 23:22
  • In some instances, you have to force trace statements to be printed because of buffering. – Nielsvh Apr 30 '19 at 16:42
  • @Nielsvh how do I do that? – cymruu Jul 30 '19 at 15:09
  • 1
    @cymruu See https://stackoverflow.com/a/230774/892327 and https://stackoverflow.com/a/230780/892327. Buffering is usually good for performance, so disabling it is probably not the best idea. – Nielsvh Jul 30 '19 at 21:27
  • I've experienced the process hanging during logging when using loggers like this with multiprocessing, even though it worked regularly otherwise. Perhaps this is more related to the `RotatingFileHandler` than the logger. – Apeiron Jun 28 '21 at 15:09
0

If you are trying to debug multiple processes running simultaneously, as shown in your example, then there's no obvious way to do that from a single terminal: which process should get the keyboard input? Because of this, Python always connects sys.stdin in the child process to os.devnull. But this means that when the debugger tries to get input from stdin, it immediately reaches end-of-file and reports an error.
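You can see this for yourself with a short sketch: the child process below reads its stdin and gets end-of-file immediately, because multiprocessing has reopened it on os.devnull.

```python
import multiprocessing
import sys

def child(q):
    # multiprocessing reopens the child's sys.stdin on os.devnull,
    # so read() hits EOF immediately and returns an empty string --
    # the same EOF that makes pdb fail inside a child process.
    q.put(sys.stdin.read())

if __name__ == '__main__':
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=child, args=(q,))
    p.start()
    p.join()
    print(repr(q.get()))  # -> ''
```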

If you can limit yourself to one subprocess at a time, at least for debugging, then you could get around this by setting sys.stdin = open(0) to reopen the main stdin, as described here.

But if multiple subprocesses may be at breakpoints simultaneously, then you will need a different solution, since they would all end up fighting over input from the single terminal. In that case, RemotePdb is probably your best bet, as described by @OnionKnight.

Matthias Fripp
-1

WingIDE Pro provides this functionality right out-of-the-box.

No additional code (e.g., use of the traceback module) is needed. You just run your program, and the Wing debugger will not only print stdout from subprocesses, but will break on errors in a subprocess and instantly create an interactive shell so you can debug the offending thread. It doesn't get any easier than this, and I know of no other IDE that exposes subprocesses in this way.

Yes, it's a commercial product. But I have yet to find any other IDE that provides a debugger to match. PyCharm Professional, Visual Studio Community, Komodo IDE - I've tried them all. In my opinion, WingIDE also leads in parsing source documentation. And the Eye Ease Green color scheme is something I can't live without now.

(Yes, I realize this question is 5+ years old. I'm answering it anyway.)

Phil