sandboxing/running python code line by line

Question

I'd love to be able to do something like these two are doing:

Inventing on principle @18:20, Live ClojureScript Game Editor

If you don't wanna check the videos, my problem is this:

Say I had this code:

....
xs = []
for x in xrange(10):
    xs.append(x)
...

I'd like to make an environment where I can execute the code statement by statement and watch/trace the locals/globals as they change. Maybe give it a list of vars to keep track of in the locals/globals dictionaries. Like stepping through the code and saving the state info.

Optimally I'd like to save every state and its associated context data (locals/globals) so I can verify predicates, for instance.

I'd like to do something like Bret Victor's binarySearch example Inventing on principle @18:20

Am I making sense? I find it complicated to explain in text, but the videos showcase what I want to try :)

Thanks for your time

What I've tried/read/googled:

code.InteractiveConsole / code.InteractiveInterpreter
the livecoding module: seems to work for pure functional/stateless code
exec / eval magic: seems that I can't get as fine grained control as I'd like.
the trace module doesn't seem to be the way either.
Python eval(compile(...), sandbox), globals go in sandbox unless in def, why? <-- This is close to what I want, but it compiles the whole string/code block and runs it in one step. If I could run a file like this, but check the locals between every line/statement..
run python source code line by line <-- This is not what I want
How do Ruby and Python implement their interactive consoles? <-- This topic suggests that I look into the code module some more

My next step would be looking into ast, compiling the code and running it bit by bit, but I really need some guidance. Should I look more into reflection and the inspect module?

I've used the Spin model checker before, but it uses its own DSL and I'd just love to do the modelling in the implementation language, in this case python.

Oh and BTW I know about the security implications of sandboxing code, but I'm not trying to make a secure execution environment, I'm trying to make a very interactive environment, aiming for crude model checking or predicate assertion for instance.

I'm not sure. I would like to manipulate the locals/globals programmatically, so if I can do that from the debugger.. Have you checked the videos? *Inventing on principle* @ 18:20 and 1 minute forwards shows kinda what I want. — Morten Jensen, Mar 12 '12 at 16:35
I can't contribute much in the way of an answer, but this is a damn good question (+1) — inspectorG4dget, Mar 12 '12 at 16:42
At first I was like, 'Oh, hey, I know a good debugger I could recommend'. I watched the videos, and then I was like, 'Does something like that exist for python?'. That's a debugger on steroids. — yurisich, Mar 13 '12 at 13:05
@Droogans: I've done it in Scheme way back, but the Lisp-dialects are easier, cause the language is already in parsable form (S-expression) and it's easy to replicate the REPL. From a quick glance, the ClojureScript example looks like it's made this way actually :) — Morten Jensen, Mar 13 '12 at 13:11
Yeah, I was thinking more in line of the *Inventing on Principle* bit...I'd imagine the load times for larger scripts might get heavy... Interesting concept. — yurisich, Mar 13 '12 at 13:49
@Droogans: Indeed, reloading the code everytime it changes, could get heavy. You could set a hook to reload the code only when saving the file tho. But I'm not so interested in the auto-interpretation feature as much as the features in the binarySearch example. I'd LOVE to have an environment like that! :) — Morten Jensen, Mar 13 '12 at 13:58

score 8 · Accepted Answer · edited May 23 '17 at 12:10

After my initial success with sys.settrace(), I ended up switching to the ast module (abstract syntax trees). I parse the code I want to analyse and then insert new calls after each assignment to report on the variable name and its new value. I also insert calls to report on loop iterations and function calls. Then I execute the modified tree.

        tree = parse(source)

        visitor = TraceAssignments()
        new_tree = visitor.visit(tree)
        fix_missing_locations(new_tree)

        code = compile(new_tree, PSEUDO_FILENAME, 'exec')

        self.environment[CONTEXT_NAME] = builder
        exec code in self.environment

I'm working on a live coding tool like Bret Victor's, and you can see my working code on GitHub, and some examples of how it behaves in the test. You can also find links to a demo video, tutorial, and downloads from the project page.
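The excerpt above relies on names defined elsewhere in the project. As a rough, self-contained illustration of the same idea (not the project's actual code), here is a minimal Python 3 sketch. The names `ReportAssignments`, `REPORT_NAME`, and `trace_assignments` are made up for this example, and it only handles plain `name = value` and `name += value` statements, not loop iterations or function calls.

```python
import ast
import copy

REPORT_NAME = "__report__"  # hypothetical name of the injected callback


class ReportAssignments(ast.NodeTransformer):
    """Insert a reporting call after each assignment to a simple name."""

    def _report_call(self, name, node):
        # Build the statement: __report__(<lineno>, '<name>', <name>)
        call = ast.Expr(value=ast.Call(
            func=ast.Name(id=REPORT_NAME, ctx=ast.Load()),
            args=[ast.Constant(value=node.lineno),
                  ast.Constant(value=name),
                  ast.Name(id=name, ctx=ast.Load())],
            keywords=[]))
        return ast.copy_location(call, node)

    def visit_Assign(self, node):
        self.generic_visit(node)
        names = [t.id for t in node.targets if isinstance(t, ast.Name)]
        # Returning a list replaces one statement with several.
        return [node] + [self._report_call(n, node) for n in names]

    def visit_AugAssign(self, node):
        self.generic_visit(node)
        if isinstance(node.target, ast.Name):
            return [node, self._report_call(node.target.id, node)]
        return node


def trace_assignments(source):
    """Run `source` and return (lineno, name, value) for each assignment."""
    events = []

    def report(lineno, name, value):
        # Copy the value so later mutation does not change the record.
        events.append((lineno, name, copy.deepcopy(value)))

    tree = ReportAssignments().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    code = compile(tree, "<live>", "exec")
    exec(code, {REPORT_NAME: report})
    return events


for event in trace_assignments("c = 0\nfor i in range(3):\n    c += i\n"):
    print(event)
```

Because the reporting calls are ordinary statements in the modified tree, the traced program runs at close to normal speed and only reports what was explicitly instrumented.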

+1 accepted. This is EXACTLY what I asked for :) Thanks for sharing this. I'm sorry I've been too busy to contribute to it. — Morten Jensen, Sep 07 '12 at 01:52

score 5 · Answer 2 · edited May 23 '17 at 12:02

Update: After my initial success with this technique, I switched to using the ast module as described in my other answer.

sys.settrace() seems to work really well. I took the hacks question you mentioned and Andrew Dalke's article and got this simple example working.

import sys

def dump_frame(frame, event, arg):
    print '%d: %s' % (frame.f_lineno, event)
    for k, v in frame.f_locals.iteritems():
        print '    %s = %r' % (k, v)
    return dump_frame

def main():
    c = 0
    for i in range(3):
        c += i

    print 'final c = %r' % c

sys.settrace(dump_frame)

main()

I had to solve two problems to get this working.

The trace function has to return itself or another trace function if you want to continue tracing.
Tracing only seems to begin after the first function call. I originally didn't have the main method, and just went directly into a loop.

Here's the output:

9: call
10: line
11: line
    c = 0
12: line
    i = 0
    c = 0
11: line
    i = 0
    c = 0
12: line
    i = 1
    c = 0
11: line
    i = 1
    c = 1
12: line
    i = 2
    c = 1
11: line
    i = 2
    c = 3
14: line
    i = 2
    c = 3
final c = 3
14: return
    i = 2
    c = 3
38: call
    item = <weakref at 0x7febb692e1b0; dead>
    selfref = <weakref at 0x17cc730; to 'WeakSet' at 0x17ce650>
38: call
    item = <weakref at 0x7febb692e100; dead>
    selfref = <weakref at 0x7febb692e0a8; to 'WeakSet' at 0x7febb6932910>
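The trailing `38: call` entries presumably come from library code that runs at interpreter exit, since tracing is never switched off. As a hedged variation on the example above (Python 3, with a made-up helper name `record_states` and a made-up pseudo-filename), the sketch below records a snapshot per line instead of printing, ignores frames from other files, and restores the previous trace function afterwards. Running the code through `exec` creates a new frame, so tracing starts without a wrapper function.

```python
import copy
import sys


def record_states(source, filename="<traced>"):
    """Run `source` and return a list of (lineno, locals_snapshot) pairs.

    Each snapshot is taken just before the given line executes.
    """
    states = []
    code = compile(source, filename, "exec")

    def tracer(frame, event, arg):
        # Skip frames that do not belong to the traced source.
        if frame.f_code.co_filename != filename:
            return None
        if event == "line":
            snapshot = {name: copy.deepcopy(value)
                        for name, value in frame.f_locals.items()
                        if not name.startswith("__")}
            states.append((frame.f_lineno, snapshot))
        # Return the tracer itself to keep receiving events for this frame.
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        exec(code, {})
    finally:
        sys.settrace(previous)
    return states


states = record_states("c = 0\nfor i in range(3):\n    c += i\ndone = True\n")

# A predicate can then be checked against every recorded state.
assert all(snapshot.get("c", 0) >= 0 for _, snapshot in states)
```

The deep copy keeps each snapshot independent of later mutation, but it will fail for values that cannot be copied (modules, open files), so a real tool would need a fallback there.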

score 3 · Answer 3 · answered Mar 12 '12 at 16:50

It sounds like you need bdb, the python debugger library. It's built-in, and the docs are here: http://docs.python.org/library/bdb.html

It doesn't have all of the functionality you seem to want, but it's a sensible place to start implementing it.
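As a starting point, here is a minimal Python 3 sketch of driving `bdb` programmatically rather than interactively. The class name `StateRecorder`, the helper `record`, and the pseudo-filename are invented for this example. `bdb.Bdb` calls `user_line` whenever it stops on a line, and calling `set_step()` there asks it to stop again on the next one.

```python
import bdb
import copy

FILENAME = "<stepped>"  # hypothetical pseudo-filename for the traced code


class StateRecorder(bdb.Bdb):
    """Single-step through code and record the variables at each line."""

    def __init__(self):
        super().__init__()
        self.states = []

    def user_line(self, frame):
        # Called by Bdb each time execution stops on a new line.
        if frame.f_code.co_filename == FILENAME:
            snapshot = {name: copy.deepcopy(value)
                        for name, value in frame.f_locals.items()
                        if not name.startswith("__")}
            self.states.append((frame.f_lineno, snapshot))
        # Keep stepping instead of waiting for user input.
        self.set_step()


def record(source):
    """Run `source` under the recorder and return its recorded states."""
    recorder = StateRecorder()
    code = compile(source, FILENAME, "exec")
    recorder.run(code, {})
    return recorder.states


for lineno, snapshot in record("xs = []\nfor x in range(3):\n    xs.append(x)\n"):
    print(lineno, snapshot)
```

Anything that should happen between statements (checking a predicate, modifying a variable) would go in `user_line`.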

Marcin


Thanks dude (+1), I'll look into `bdb` :) I was really hoping there was something I could do with the `eval` / `exec` functions, or that I was missing something. – Morten Jensen Mar 12 '12 at 19:43

score 3 · Answer 4 · edited May 23 '17 at 11:47

Okay guys, I've made a bit of progress.

Say we have a source file like this that we want to run statement by statement:

print("single line")
for i in xrange(3):
    print(i)
    print("BUG, executed outside for-scope, so only run once")
if i < 0:
    print("Should not get in here")
if i > 0:
    print("Should get in here though")

I want to execute it one statement at a time, while having access to the locals/globals. This is a quick dirty proof of concept (disregard the bugs and crudeness):

import re

# `source` is assumed to hold the text of the file shown above
# returns matched text if found
def re_match(regex, text):
    m = regex.match(text)
    if m: return m.groups()[0]

# regex patterns
newline = "\n"
indent = "[ ]{4}"
line = "[\w \"\'().,=<>-]*[^:]"
block = "%s:%s%s%s" % (line, newline, indent, line)

indent_re = re.compile(r"^%s(%s)$" % (indent, line))
block_re = re.compile(r"^(%s)$" % block)
line_re =  re.compile(r"^(%s)$" % (line))

buf = ""
indent = False

# parse the source using the regex-patterns
for l in source.split(newline):
    buf += l + newline              # add the newline we removed by splitting

    m = re_match(indent_re, buf)    # is the line indented?
    if m: 
        indent = True               # yes it is
    else:
        if indent:                  # else, were we indented previously?
            indent = False          # okay, now we aren't

    m = re_match(block_re, buf)     # are we starting a block ?
    if m:
        indent = True
        exec(m)
        buf = ""
    else:
        if indent: buf = buf[4:]   # hack to remove indentation before exec'ing
        m = re_match(line_re, buf) # single line statement then?
        if m:
            exec(m) # execute the buffer, reset it and start parsing
            buf = ""
        # else no match! add a line more to the buffer and try again

Output:

morten@laptop /tmp $ python p.py
single line
0
1
2
BUG, executed outside for-scope, so only run once
Should get in here though

So this is somewhat what I want. This code breaks the source into executable statements and I'm able to "pause" in between statements and manipulate the environment. As the code above shows, I can't figure out how to properly break up the code and execute it again. This made me think that I should be able to use some tool to parse the code and run it like I want. Right now I'm thinking ast or pdb like you guys suggest.

A quick look suggests ast can do this, but it seems a bit complex so I'll have to dig into the docs. If pdb can control the flow programmatically, that may very well be the answer too.
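For what it's worth, the splitting part can be handed to `ast` instead of regular expressions. The Python 3 sketch below is only an illustration (the function name `run_statements` and the `on_step` callback are invented here): it parses the source, then compiles and executes each top-level statement on its own in a shared environment, so a compound statement such as a `for` loop stays in one piece, body included.

```python
import ast


def run_statements(source, on_step=None):
    """Execute `source` one top-level statement at a time.

    After each statement, `on_step(lineno, env)` is called, where `env`
    is the shared globals dictionary and may be inspected or modified.
    """
    env = {}
    tree = ast.parse(source)
    for stmt in tree.body:
        # Wrap the single statement in its own module so it compiles alone.
        module = ast.Module(body=[stmt], type_ignores=[])
        code = compile(module, "<stepped>", "exec")
        exec(code, env)
        if on_step is not None:
            on_step(stmt.lineno, env)
    return env


SOURCE = """\
print("single line")
for i in range(3):
    print(i)
    print("inside the loop, so printed three times")
if i < 0:
    print("Should not get in here")
if i > 0:
    print("Should get in here though")
"""

run_statements(SOURCE, on_step=lambda lineno, env: print("after line", lineno, "i =", env.get("i")))
```

This only pauses between top-level statements. Pausing inside a loop body or a function needs either tracing or rewriting the tree.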

Update:

Sooo, I did some more reading and I found this topic: What cool hacks can be done using sys.settrace?

I looked into using sys.settrace(), but it doesn't seem to be the way to go. I am getting more and more convinced I need to use the ast module to get as fine-grained control as I would like. FWIW, here's the code that uses settrace() to peek at the variables in function scope:

import sys

def trace_func(frame,event,arg):
    print "trace locals:"
    for l in frame.f_locals:
        print "\t%s = %s" % (l, frame.f_locals[l])

def dummy(ls):
    for l in ls: pass

sys.settrace(trace_func)
x = 5
dummy([1, 2, 3])
print "whatisthisidonteven-"

output:

morten@laptop /tmp $ python t.py 
trace locals:
    ls = [1, 2, 3]
whatisthisidonteven-
trace locals:
    item = <weakref at 0xb78289b4; dead>
    selfref = <weakref at 0xb783d02c; to 'WeakSet' at 0xb783a80c>
trace locals:
    item = <weakref at 0xb782889c; dead>
    selfref = <weakref at 0xb7828504; to 'WeakSet' at 0xb78268ac>

UPDATE:

Okay, I seem to have solved it :) I've written a simple parser that injects a statement between each line of code and then executes the code. This statement is a function call that captures and saves the local environment in its current state.

I'm working on a Tkinter text editor with two windows that'll do what Bret Victor does in his binarySearch-demo. I'm almost done :)

Yeah totally git that and reply here. Id like to try it out for you. — yurisich, Mar 15 '12 at 01:39
I still need to do some work on it, but I think I'll finish it up in the weekend :) I'll hit you back whenever I got something running — Morten Jensen, Mar 16 '12 at 12:17
I'm working on the same thing, but I haven't gotten as far. I did a [rough proof of concept](https://github.com/donkirkby/live-py) that just executes an entire block of code every second. Now I'm trying to build an Eclipse plugin that will add live coding to PyDev. I just figured out how to [add an extra ruler](https://github.com/donkirkby/live-py-plugin) to an Eclipse editor. Let me know if you want to work together. — Don Kirkby, May 16 '12 at 05:51
I've got local variable assignments and looping appearing in the ruler. [Check it out](https://github.com/donkirkby/live-py-plugin), @Droogans and Morten. — Don Kirkby, May 20 '12 at 05:18

score 2 · Answer 5 · answered Mar 13 '12 at 12:31

For simple tracing I suggest you use pdb. I've found it's quite reasonable for most debugging/single stepping purposes. For your example:

import pdb
...
xs = []
pdb.set_trace()
for x in xrange(10):
    xs.append(x)

Now your program will stop at the set_trace() call and you can use n or s to step through your code while it's executing. AFAIK pdb is using bdb as its backend.

hochl


I read a bit about `pdb` yesterday, but it seemed like I was supposed to do the flow-control manually. Am I correct about that? I'd like to be able to control the flow programmatically. Just stepping through the code one statement at a time while giving me the possibility to do something in between statements. – Morten Jensen Mar 13 '12 at 12:37
Well as with any interactive debugger you can set breakpoints, examine variables or modify them. – hochl Mar 13 '12 at 12:50
Puh -- I've never done that I must agree ... just used it for the average debugging task. – hochl Mar 13 '12 at 13:01

score 2 · Answer 6 · answered Mar 15 '12 at 00:17

I see you've come up with something that works for you, but thought it would be worth mentioning 'pyscripter'. http://code.google.com/p/pyscripter/

I'm pretty new to python, but I'm finding it very useful to
simply click past the line that has a variable I want to check,
then press f4 to run it in a debugger mode.
After that I can just hover the mouse over the variable and it pops up
a tooltip that has the variable's values.

You can also single step through the script with f7 as described here:
http://openbookproject.net/thinkcs/python/english3e/functions.html#flow-of-execution
(see 'Watch the flow of execution in action')

Although when I followed the example it still stepped into the turtle module for some reason.

Cool, I'll definitely look into that :) – Morten Jensen Mar 15 '12 at 00:44

score 0 · Answer 7 · answered Mar 12 '12 at 16:29

download eclipse+pydev and run it in debug mode...

WeaselFox


That won't quite be powerful enough, I think. I want to do more than just watch the program state. I would like to build a lattice of all possible program states, run through all of them, and assert a predicate at each state, for example. `for state in states: assert state.locals["var"] >= 0` or something – Morten Jensen Mar 12 '12 at 16:37
If you mean you haven't seen that in general, check out [the Spin model checker](http://spinroot.com/spin/whatispin.html). Limited DSL with concurrency and IPC that can verify predicates on the code. You build a model of your system and prove stuff about it. – Morten Jensen Mar 13 '12 at 02:03
