
I have a rather unusual request, I think. I'll explain the why after I explain the what.

What

I want to detect whenever my object is written to stdout, so that I can perform side effects at that time. So, for example, when I type:

sys.stdout.write(instance_of_my_class)

it should perform side effects. I've made my class be a subclass of str and overrode __call__, __unicode__, __str__, __repr__, index, decode, encode, format, __format__, __getattribute__,__getitem__, and __len__ so that each of them prints a statement indicating that they've been called, but it seems that sys.stdout.write calls none of those in order to print an object.

Note that I'm specifically talking about sys.stdout.write and not, for example, print - I have found that print calls __str__ on whatever it is given.
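
The difference between the two calls can be reproduced in a few lines. This is only a sketch written for Python 3 (the question itself concerns Python 2.7), and the class name `Loud` and the captured-output setup are assumptions made for illustration, not part of the original code:

```python
import contextlib
import io
import sys


class Loud(str):
    """Hypothetical str subclass that records every __str__ call."""

    calls = []

    def __str__(self):
        Loud.calls.append("__str__")
        return str.__str__(self)


obj = Loud("hello\n")
captured = io.StringIO()

with contextlib.redirect_stdout(captured):
    sys.stdout.write(obj)            # the stream takes the text as-is
    calls_after_write = list(Loud.calls)
    print(obj, end="")               # print() converts its argument with str() first
    calls_after_print = list(Loud.calls)
```

Comparing `calls_after_write` with `calls_after_print` shows which of the two calls went through `__str__`.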

Why

This question continues from where the answer to "Colored Python Prompt in Windows?" left off.

I have found that each time Python needs to display an interactive prompt, it calls __str__ on both sys.ps1 and sys.ps2 and saves the results to be displayed on the command line. This means any side effects in sys.ps2.__str__ occur right after the ones in sys.ps1.__str__, but I want them to wait until it's time to display sys.ps2.
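
For context, the Python documentation for sys.ps1 and sys.ps2 says that a non-string object assigned to either has its str() re-evaluated each time the interpreter prepares to read a new interactive command. A minimal sketch of a prompt object built on that hook follows; the class name `CountingPrompt` and its `renders` counter are hypothetical, and the last line only simulates what the interpreter would do:

```python
import sys


class CountingPrompt:
    """Hypothetical prompt object whose __str__ runs a side effect on each rendering."""

    def __init__(self, text):
        self.text = text
        self.renders = 0

    def __str__(self):
        self.renders += 1      # the side effect would go here
        return self.text


prompt = CountingPrompt("... ")
sys.ps2 = prompt               # the interactive interpreter calls str() on this object
first = str(prompt)            # simulate one rendering by the interpreter
```

This shows where the side effect is triggered, not when the text reaches the terminal, which is the timing problem described above.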

So rather than return a str in sys.ps2.__str__, I've been returning my subclass of str, which I'm hoping will somehow be capable of catching when sys.stdout.write is called on it.

ArtOfWarfare

2 Answers


Intriguing problem! My first guess is that sys.stdout.write doesn't call the __str__ method because your object already is a str (or at least a subclass of it, which is good enough for all intents and purposes)... so no casting methods are needed.

Further investigation suggests that sys.stdout.write really doesn't ever want to call the __str__ method ...

Subclass approach

With a little introspection, you can find out which methods of your str subclass are called by sys.stdout.write (the answer is, not many):

import sys

class superstring(str):
    # Log every attribute lookup, then defer to str's normal lookup.
    def __getattribute__(self, name):
        print "*** lookup attribute %s of %s" % (name, repr(self))
        return str.__getattribute__(self, name)

foo = superstring("UberL33tPrompt> ")
sys.stdout.write(foo)

Running in a Unicode environment (Python 2.7, iPython notebook), this prints:

*** lookup attribute __class__ of 'UberL33tPrompt> '
*** lookup attribute decode of 'UberL33tPrompt> '
UberL33tPrompt> 

It seems rather kludge-y, but you could override the subclass's decode method to perform the desired side effects.

However, in a non-Unicode environment there are no attribute lookups.

Wrapper approach

Rather than using a subclass of str, maybe what you need is some kind of "wrapper" around str. Here's an ugly exploratory hack which creates a class that delegates most of its attributes to str, but which is not strictly a subclass thereof:

import sys

class definitely_not_a_string(object):
    def __init__(self, s):
        self.s = s
    def __str__(self):
        print "*** Someone wants to see my underlying string object!"
        return self.s
    def decode(self, encoding, whatever):
        print "*** Someone wants to decode me!"
        return self.s.decode(encoding, whatever)
    def __getattribute__(self, name):
        print "*** lookup attribute %s of %s" % (name, repr(self))
        if name in ('s', '__init__', '__str__', 'decode', '__class__'):
            return object.__getattribute__(self, name)
        else:
            # Delegate everything else to the wrapped str. Calling
            # str.__getattribute__ on a non-str instance raises TypeError.
            return getattr(object.__getattribute__(self, 's'), name)

foo = definitely_not_a_string("UberL33tPrompt> ")
sys.stdout.write(foo)

In the Unicode environment, this gives basically the same results:

*** lookup attribute __class__ of <__main__.definitely_not_a_string object at 0x00000000072D79B0>
*** lookup attribute decode of <__main__.definitely_not_a_string object at 0x00000000072D79B0>
*** Someone wants to decode me!
*** lookup attribute s of <__main__.definitely_not_a_string object at 0x00000000072D79B0>
UberL33tPrompt> 

However, when I run in a non-Unicode environment, definitely_not_a_string gives an error message:

TypeError: expected a character buffer object

... this shows that the .write method is going straight to the C-level buffer interface when it doesn't need to do any Unicode decoding.
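
A comparable rejection can be seen on Python 3, where text streams only accept real str objects. The sketch below is an analogue under that assumption rather than a reproduction of the Python 2.7 behavior above; `NotAString` is a hypothetical stand-in for the wrapper class, and io.StringIO stands in for the real stdout:

```python
import io


class NotAString:
    """Hypothetical wrapper that holds a str but does not inherit from it."""

    def __init__(self, s):
        self.s = s

    def __str__(self):
        return self.s


stream = io.StringIO()
try:
    stream.write(NotAString("UberL33tPrompt> "))
except TypeError as exc:
    error = exc     # the stream rejects the wrapper without calling __str__
else:
    error = None
```

The write fails before anything is written, so a wrapper that merely looks like a string gets no chance to run its hooks.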

My conclusion

It seems that overriding the decode method is a possible kludge in Unicode environments, since sys.stdout.write calls this method when it needs to decode a str into Unicode.

However, in non-Unicode environments it appears that .write doesn't do any attribute lookups whatsoever, but simply goes straight to the C-level character buffer protocol, so there's no way to intercept its access from Python code. Indeed, help(sys.stdout.write) verifies that it's a built-in function (aka written in C, not Python).

Dan Lenski
  • Unfortunately, I'm working in a non-Unicode environment. You mention that it's not possible to catch from Python code because it's C code. Would it be possible to capture it from C code, somehow? – ArtOfWarfare Jun 19 '14 at 20:48
    Well, you *could* write a custom `str` subclass in C which somehow hooked into the buffer protocol in order to notify your Python code to do something special. This is starting to sound like an anti-pattern though: you're trying to "trick" `sys.stdout.write` into seeing a different view of your prompt string than all other code. Why not just modify or replace the code that runs `sys.stdout.write` in the first place? – Dan Lenski Jun 19 '14 at 21:34
  • The code that runs `sys.stdout.write` is the python interactive interpreter - the portion that prints the prompt (normally `>>>` for ps1 and `...` for ps2). I looked into the possibility of modifying it, but I couldn't even find the relevant file in the Python source distribution. At this point, I'm more interested in hearing if there's any solution than actually implementing one, I think - this is way too much trouble just to change the `...` in cmd to be green instead of white - it already works completely on every *nix system and the `>>>` is green even on Windows. – ArtOfWarfare Jun 19 '14 at 23:11
  • Have you tried using IPython or one of the other front-ends that replaces the standard REPL interactive interface with a nicer one? IPython works great under Windows... – Dan Lenski Jun 19 '14 at 23:22
  • Hm, I haven't. I have a lot of VMs that I have to manage and I frequently pull up Python on them - I like syncing a single PYTHONSTARTUP file between them all via Dropbox so that I have a consistent experience regardless of platform/VM instance. – ArtOfWarfare Jun 19 '14 at 23:46

Why not monkeypatch stdout.write?

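import sys

stdoutRegistry = set()

class A(object):
    def __init__(self):
        # stdoutRegistry is a module-level set, not an attribute of A
        stdoutRegistry.add(self)

    def stdoutNotify(self):
        pass

original_stdoutWrite = sys.stdout.write
def stdoutWrite(*a, **kw):
    # a is the tuple of positional arguments; the object being written is a[0]
    if a and any(a[0] is obj for obj in stdoutRegistry):
        a[0].stdoutNotify()
    return original_stdoutWrite(*a, **kw)
sys.stdout.write = stdoutWrite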
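
A variation on the same idea is to replace sys.stdout with a proxy object instead of assigning to its write attribute, which may matter on Python 2, where the built-in file type can refuse that assignment. The following is a minimal sketch for Python 3; `NotifyingStdout`, `watch`, and `Prompt` are hypothetical names, and an io.StringIO buffer stands in for the real stream. Watched objects are matched by identity, since a str subclass compares equal to any plain string with the same text:

```python
import io


class NotifyingStdout:
    """Hypothetical proxy: forwards writes to a real stream, notifying on watched objects."""

    def __init__(self, stream):
        self._stream = stream
        self._watched = []     # (object, callback) pairs, matched by identity

    def watch(self, obj, callback):
        self._watched.append((obj, callback))

    def write(self, text):
        for obj, callback in self._watched:
            if text is obj:
                callback(text)
        return self._stream.write(text)

    def __getattr__(self, name):
        # Delegate everything else (flush, encoding, ...) to the real stream.
        return getattr(self._stream, name)


class Prompt(str):
    """Hypothetical marker subclass for the string being watched."""


buffer = io.StringIO()
proxy = NotifyingStdout(buffer)
prompt = Prompt("... ")
seen = []
proxy.watch(prompt, seen.append)

proxy.write("... ")    # equal text but a different object: no notification
proxy.write(prompt)    # the watched object itself: the callback fires
```

In real use the proxy would be installed with `sys.stdout = NotifyingStdout(sys.stdout)`. It only sees writes that actually go through `sys.stdout.write`.
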
Ben
  • That's an interesting idea. I wonder if it would really do what I want... there's a lot of really cool stuff to easily modify the behavior of the standard REPL if that would work. I'll look into it later this evening and let you know if it works or not (and I'll accept your answer if it does work). – ArtOfWarfare Jun 30 '14 at 15:03
  • +1 Ben: This works to answer the question that I asked, but unfortunately it doesn't actually detect when sys.ps2 is printed. – ArtOfWarfare Jun 30 '14 at 21:24
  • You can just check for `if a is sys.ps2` – Ben Jul 01 '14 at 15:29
  • You misunderstand. `sys.ps2` (normally `...`) is the message that the Python interactive interpreter prints out to designate that you should continue typing your input (i.e., because the prior line ended in `:` or was indented). I want to detect when the interpreter prints that prompt, but it would appear that it doesn't utilize `sys.stdout`'s write method. I've tried reading through the Python source files to find out how it does print those messages, but I couldn't find it. – ArtOfWarfare Jul 01 '14 at 22:15
  • After having spent the past 2 hours digging through Python's source, I finally found the line where it prompts for input. In `tokenizer.c` there's a function that gets called, `PyOS_ReadLine()`, which is passed a prompt as a C-string. The issue is that Python doesn't cache the string values of sys.ps1 and sys.ps2 in a Python structure as I had thought, but in a C structure, so I don't think it'll be possible to hijack it the way I wanted without creating my own build of the Python executable (too much work for too little benefit). – ArtOfWarfare Jul 02 '14 at 00:16
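
One avenue the thread does not explore is the standard library's code.InteractiveConsole, a REPL written in Python whose interact() loop passes the current prompt to an overridable raw_input() method. The sketch below, written for Python 3, assumes that running such a nested console is acceptable; `HookedConsole`, `on_continuation`, and `reader` are hypothetical names, and the last lines simulate two prompt displays with a fake reader instead of real keyboard input:

```python
import code
import sys


class HookedConsole(code.InteractiveConsole):
    """Hypothetical REPL that runs a callback just before the continuation prompt."""

    def __init__(self, on_continuation, reader=input, locals=None):
        super().__init__(locals=locals)
        self.on_continuation = on_continuation
        self.reader = reader

    def raw_input(self, prompt=""):
        # interact() passes sys.ps2 here when more input is expected.
        if prompt == getattr(sys, "ps2", "... "):
            self.on_continuation()
        return self.reader(prompt)


events = []
console = HookedConsole(lambda: events.append("ps2"), reader=lambda prompt: "pass")
console.raw_input(">>> ")                          # primary prompt: no callback
console.raw_input(getattr(sys, "ps2", "... "))     # continuation prompt: callback runs
```

A real session would be started with something like `HookedConsole(callback).interact()`, for example from a PYTHONSTARTUP file. This sidesteps the C-level prompt handling rather than hooking into it.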