Python debugger with line editing in a program that uses stdin

Question

To add an ad hoc debugger breakpoint in a Python script, I can insert the line

import pdb; pdb.set_trace()

Pdb reads from standard input, so this doesn't work if the script itself also reads from standard input. As a workaround, on a Unix-like system, I can tell pdb to read from the terminal:

import pdb; pdb.Pdb(stdin=open('/dev/tty', 'r'), stdout=open('/dev/tty', 'w')).set_trace()

This works, but unlike with a plain pdb.set_trace, I don't get the benefit of the command-line editing provided by the readline library (arrow keys, etc.).

How can I enter pdb without interfering with the script's stdin and stdout, and still get command-line editing?

Ideally the same code should work in both Python 2 and Python 3. Compatibility with non-Unix systems would be a bonus.

Toy program as a test case:

#!/usr/bin/env python
import sys
for line in sys.stdin:
    #import pdb; pdb.set_trace()
    import pdb; pdb.Pdb(stdin=open('/dev/tty', 'r'), stdout=open('/dev/tty', 'w')).set_trace()
    sys.stdout.write(line)

Usage: { echo one; echo two; } | python cat.py

Ondrej K. · Answer 1 · 2019-01-04T20:24:47.030

I hope I have not missed anything important, but it seems you cannot really do this in an entirely trivial way. readline only gets used if pdb.Pdb (or rather cmd.Cmd, which it subclasses) has use_rawinput set to a non-zero value, and in that case the stdin you pass in is ignored, so the input for the debugger and for the script itself gets mixed. That said, the best I've come up with so far is:

#!/usr/bin/env python3
import os
import sys
import pdb

pdb_inst = pdb.Pdb()

stdin_called = os.fdopen(os.dup(0))
console_new = open('/dev/tty')
os.dup2(console_new.fileno(), 0)
console_new.close()
sys.stdin = os.fdopen(0)

for line in stdin_called:
    pdb_inst.set_trace()
    sys.stdout.write(line)

It is relatively invasive to your original script, although it could at least be placed outside of it and then imported and called, or used as a wrapper.
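As a minimal sketch of the "placed outside and imported" idea, the descriptor juggling could live in a small helper. The names detach_stdin, tty_path and fd are hypothetical (they are not part of pdb), and the sketch assumes a Unix-like system where /dev/tty can be opened:

```python
import os
import sys


def detach_stdin(tty_path="/dev/tty", fd=0):
    """Return a file object for the original input and put the terminal on `fd`.

    Hypothetical helper for illustration; not a pdb or stdlib API.
    """
    original = os.fdopen(os.dup(fd))   # keep the piped input reachable
    tty_fd = os.open(tty_path, os.O_RDONLY)
    try:
        os.dup2(tty_fd, fd)            # `fd` now refers to the terminal
    finally:
        os.close(tty_fd)               # the duplicate on `fd` stays open
    if fd == 0:
        # Fresh object so no data buffered from the old stdin leaks through.
        sys.stdin = os.fdopen(0)
    return original
```

The script's loop would then read `for line in detach_stdin():` and keep calling pdb_inst.set_trace() as before.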

I've duplicated the incoming STDIN onto a new file descriptor and opened that as stdin_called. Then (based on your example) I've opened /dev/tty for reading, replaced the process's file descriptor 0 with it (0 is STDIN; it would be better to use the value returned by sys.stdin.fileno()), and assigned a corresponding file-like object to sys.stdin. This way the program's loop and pdb each use their own input stream, and pdb gets to interact with what appears to be a "normal" console STDIN, on which it is happy to enable readline.
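The use_rawinput behaviour behind all this can be observed without a terminal. The sketch below assumes CPython 3.6 or later (readrc=False is only there to skip any .pdbrc file) and uses StringIO objects as stand-ins for the /dev/tty files from the question:

```python
import io
import pdb

# Default construction: cmd.Cmd leaves use_rawinput truthy, so prompts go
# through input(), which can use readline on a real console.
default_dbg = pdb.Pdb(readrc=False)

# Explicit streams: pdb switches use_rawinput off and reads lines from
# self.stdin directly, bypassing readline.
custom_dbg = pdb.Pdb(stdin=io.StringIO(), stdout=io.StringIO(), readrc=False)

print("default:", bool(default_dbg.use_rawinput))
print("custom: ", bool(custom_dbg.use_rawinput))
```

This should report True for the default instance and False for the one given explicit streams, which is why the approach here leaves pdb with its default streams and swaps what fd 0 points to instead.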

It isn't pretty, but it should do what you were after, and hopefully it provides useful hints. When in pdb, it uses readline (line editing, history, completion) if available:

$ { echo one; echo two; } | python3 cat.py
> /tmp/so/cat.py(16)<module>()
-> sys.stdout.write(line)
(Pdb) c
one
> /tmp/so/cat.py(15)<module>()
-> pdb_inst.set_trace()
(Pdb) con[TAB][TAB]
condition  cont       continue   
(Pdb) cont
two

Note that starting with Python 3.7 you could use breakpoint() instead of import pdb; pdb.Pdb().set_trace() for convenience. You could also check the result of the dup2 call to make sure the file descriptor was replaced as expected (os.dup2 returns the new descriptor from Python 3.7 on; earlier versions return None).
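One detail of the breakpoint() route: breakpoint() goes through sys.breakpointhook, whose default calls pdb.set_trace() and therefore creates a fresh Pdb instance each time. If a prepared instance should be reused, the hook can be pointed at it. A minimal sketch, assuming Python 3.7+; use_debugger_for_breakpoint is a hypothetical helper name:

```python
import sys


def use_debugger_for_breakpoint(debugger):
    """Make breakpoint() stop in `debugger` (hypothetical helper, Python 3.7+)."""
    def hook(*args, **kwargs):
        # Frame 1 is the code that called breakpoint(); stop there rather
        # than inside this hook function.
        debugger.set_trace(sys._getframe(1))

    sys.breakpointhook = hook
    return hook
```

After calling use_debugger_for_breakpoint(pdb_inst) once, a plain breakpoint() in the loop would enter pdb_inst. Assigning sys.__breakpointhook__ back to sys.breakpointhook restores the default.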

EDIT: As mentioned earlier and noted in a comment by the OP, this is both ugly and invasive to the script. The following doesn't make it any prettier, but we can employ a few tricks to reduce the impact on the surrounding code. One such option I've hacked together:

import sys

# Add this: BEGIN
import os
import pdb
import inspect

pdb_inst = pdb.Pdb()

class WrapSys:
    def __init__(self):
        self.__stdin = os.fdopen(os.dup(0))
        self.__console = open('/dev/tty')
        os.dup2(self.__console.fileno(), 0)
        self.__console.close()
        self.__console = os.fdopen(0)
        self.__sys = sys

    def __getattr__(self, name):
        if name == 'stdin':
            if any((f.filename.endswith("pdb.py") for f in inspect.stack())):
                return self.__console
            else:
                return self.__stdin
        else:
            return getattr(self.__sys, name)

sys = WrapSys()
# Add this: END

for line in sys.stdin:
    pdb_inst.set_trace()  # Inject breakpoint
    sys.stdout.write(line)

I have not dug all the way through, but as it stands, pdb/cmd really seems to need not only sys.stdin but also for it to be on fd 0 in order for readline to kick in. The example above takes things up a notch: within our script it hijacks what sys stands for, in order to present a different meaning of sys.stdin when code from pdb.py is on the stack. One obvious caveat: if anything other than pdb also expects and depends on the sys.stdin fd being 0, it would still be out of luck (or would read its input from a different stream if it just went ahead).
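The stack test in __getattr__ runs on every access to sys.stdin, and inspect.stack() also collects source context for each frame, which is more than the check needs. A lighter variant, as a sketch, walks the frames directly; called_from is a hypothetical helper name, and sys._getframe is a CPython implementation detail:

```python
import sys


def called_from(filename_suffix):
    """Return True if any caller on the current stack lives in a matching file."""
    frame = sys._getframe(1)  # start at our caller, skipping this function
    while frame is not None:
        if frame.f_code.co_filename.endswith(filename_suffix):
            return True
        frame = frame.f_back
    return False
```

In WrapSys.__getattr__ the condition would then become `called_from("pdb.py")`. It shares the weakness of the original check: any unrelated file whose name ends in pdb.py would also match.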

I'm afraid that going through the script and replacing all uses of stdin with something else is a no-go. Just like it seems that `cmd` and `raw_input` hard-code the use of stdin, the script may be using other libraries that hard-code stdin. - Gilles 'SO- stop being evil', Jan 03 '19 at 07:45
@Gilles: Yeah, I agree it's quite ugly. I suspect it might be a bit better to go the other way around and dig through `pdb`/`cmd`/`readline` and perhaps subclass. I was also considering the minimal code required to get the desired result. Anyway, I've extended the example a bit; perhaps with that (placed inside a module, except for the `sys` reassignment) it may be more helpful for your situation? - Ondrej K., Jan 04 '19 at 20:28
