
While playing around with compile(), the marshal module, and exec, I've encountered some confusing behavior. Consider simple.py:

def foo():
    print "Inside foo()..."

def main():
    print "This is a simple script that should count to 3."

    for i in range(1, 4):
        print "This is iteration number", i

    foo()

if __name__ == "__main__":
    main()

When I run this script using exec like this

with open('simple.py', 'r') as f:
    code = f.read()
exec code

it gives the expected output.

This is a simple script that should count to 3.
This is iteration number 1
This is iteration number 2
This is iteration number 3
Inside foo()...

However, when I introduce compile(), marshal.dump(), and marshal.load() like this

import marshal

def runme(file):
    with open(file, "rb") as f:
        code = marshal.load(f)
    exec code

with open("simple.py", "r") as f:
    contents = f.read()

code = compile(contents, "simple.py", "exec")
with open("marshalled", "wb") as f:
    marshal.dump(code, f)

runme("marshalled")

it prints the beginning of the expected output and then errors out

This is a simple script that should count to 3.
This is iteration number 1
This is iteration number 2
This is iteration number 3
Traceback (most recent call last):
  File "./exec_within_function.py", line 17, in <module>
    runme("marshalled")
  File "./exec_within_function.py", line 8, in runme
    exec code
  File "simple.py", line 15, in <module>
    main()
  File "simple.py", line 12, in main
    foo()
NameError: global name 'foo' is not defined

Why does it say that foo is not defined?

In order to understand, I tried using dir() like this

import simple # imports simple.py
dir(simple)

and as expected, it shows that foo is defined.

['__builtins__', '__doc__', '__file__', '__name__', '__package__', 'foo', 'main']

I've also noticed that when I use dis.dis() on the deserialized code object (read via marshal.load()), the only thing I see is the LOAD_NAME and CALL_FUNCTION for main(), but when I do it with import like this

import dis, sys

import simple
dis.dis(sys.modules["simple"])

it gives me the entire disassembly as expected.

I've even looked at some of the code that Python uses for compiling, and although I think import uses some sort of lookup table for definitions, I'm not sure what compile() does differently to cause this behavior.

timotree
user1633448
  • What do you mean? The following seems to work: `with open('simple.py', 'r') as f: code = f.read()`, then `with open('simple.mash', 'w') as f: marshal.dump(code, f)`, then `with open('simple.mash', 'r') as f: code = marshal.load(f)`, then `exec code` – Dhara Aug 29 '12 at 15:19
  • Before you `marshal.dump()` the code to simple.mash, try running `compile()` on it, then `marshal.dump()` the resulting code object. – user1633448 Aug 29 '12 at 15:24

2 Answers


This script successfully exercises your simple.py code 3 times. Does this clarify anything? Or am I misunderstanding your question?

# from original example
with open('simple.py', 'r') as f:
    code = f.read()
exec(code)
# compile and run again
a = compile(code, "simple_compiled_this_file_not_created", "exec")
exec(a)
# marshal and unmarshal
import marshal
with open("./marshalfoo.bin", "wb") as f:
    marshal.dump(a, f)
with open("./marshalfoo.bin", "rb") as f:
    b = marshal.load(f)
exec(b)
DrSkippy
  • Interesting. I was able to track down my issue based on your code. It appears that `exec()` fails when I call it from inside a function (e.g. after running the source through `compile()`, I call a function I created called `runme()` that calls `exec()` on the resulting code object). What's even more interesting is that if I call `exec()` from the same scope as `compile()` and then call `runme()`, it works fine. So it's definitely a scoping issue. – user1633448 Aug 30 '12 at 13:44
  • For anyone who cares, I got this to work as expected by wrapping the code in simple.py in a custom class (I called it Simple), and then doing `s = Simple()` and `s.main()` in the `if __name__ == "__main__"` block. After that the code appears to work fine from any scope. – user1633448 Sep 05 '12 at 13:06

Why does it say that foo is not defined?

This much smaller example gives the same error

with open("simple.py", "r") as f:
    code = f.read()

def wrap_exec(code):
    exec code

wrap_exec(code)

but this one doesn't.

with open("simple.py", "r") as f:
    code = f.read()

exec code

If you hadn't already guessed, the problem occurs when you call exec from within a function.
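
The same split can be reproduced without a function. What follows is a minimal sketch, assuming an inline SOURCE string that stands in for simple.py; it returns strings instead of printing so that it runs on both Python 2 and 3, and the names `SOURCE`, `run_in_fresh_namespace`, `globals_ns` and `locals_ns` are illustrative rather than taken from the question. Passing separate globals and locals dictionaries to exec mimics what happens inside a function: the top-level def statements bind foo and main in the locals, while main() looks foo up in the globals.

```python
# Hypothetical stand-in for simple.py; returns a string instead of printing.
SOURCE = """
def foo():
    return "Inside foo()..."

def main():
    return foo()

result = main()
"""

code = compile(SOURCE, "<simple>", "exec")

# Separate dicts: roughly the situation of a bare exec inside a function.
globals_ns, locals_ns = {}, {}
try:
    exec(code, globals_ns, locals_ns)
    error = None
except NameError as exc:
    error = exc  # main() searched globals_ns for foo and did not find it


def run_in_fresh_namespace(code_obj):
    """Run code_obj with one dict serving as both globals and locals."""
    namespace = {"__name__": "__main__"}
    exec(code_obj, namespace)
    return namespace


# Works even though exec is called from inside a function.
namespace = run_in_fresh_namespace(code)
```

In the question's runme(), the Python 2 spelling of the second form would be `exec code in namespace`, which is an option when the exec'd file itself cannot be edited.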

To find the best explanations and solutions for that, look at the answers to "Why exec() works differently when invoked inside of function and how to avoid it". For completeness, below is how I recommend you solve it in this case.

Since you are able to change the exec'd code (simple.py in this example), the problem can easily be solved by adding a global declaration:

global foo  # added: binds foo in the global namespace even when exec'd inside a function

def foo():
    print "Inside foo()..."

def main():
    print "This is a simple script that should count to 3."

    for i in range(1, 4):
        print "This is iteration number", i

    foo()

if __name__ == "__main__":
    main()
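
A quick way to check what the declaration changes, sketched with an inline string in place of the edited simple.py. The names below are illustrative, and the functions return strings so the snippet runs on both Python 2 and 3. Separate globals and locals dictionaries stand in for exec inside a function; with the declaration in place, foo should land in the globals while main stays in the locals.

```python
# Hypothetical stand-in for the edited simple.py.
SOURCE_WITH_GLOBAL = """
global foo

def foo():
    return "Inside foo()..."

def main():
    return foo()

result = main()
"""

# Two dicts approximate the namespaces a bare exec sees inside a function.
globals_ns, locals_ns = {}, {}
exec(compile(SOURCE_WITH_GLOBAL, "<simple>", "exec"), globals_ns, locals_ns)

foo_is_global = "foo" in globals_ns   # the def was stored as a global
main_is_local = "main" in locals_ns   # main was not declared global
```

Only foo needs the declaration here, because main is called from the top level of the exec'd code, where the locals are searched first.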

As for why dir(simple) still shows foo: you imported simple.py rather than exec'ing its contents. Not only does foo appear in the output of dir(), but the program also works when you use import:

import simple
simple.main()

If this surprises you, it's because when you import something, Python treats it as a module. Within a module, names defined at the top level are stored in the module's global namespace, which is the same namespace main() searches when it looks up foo.
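
Roughly the same effect can be had by hand. The sketch below is only an approximation of what import does, not the real import machinery: it creates an empty module object with types.ModuleType and executes the compiled source with the module's `__dict__` as the single namespace. `SOURCE`, `load_into_module`, and the module name `simple_demo` are made up for illustration, and the functions return strings so the code runs on both Python 2 and 3.

```python
import types

# Hypothetical stand-in for simple.py.
SOURCE = """
def foo():
    return "Inside foo()..."

def main():
    return foo()
"""


def load_into_module(source, name):
    """Execute source inside a new module object and return the module."""
    module = types.ModuleType(name)
    code = compile(source, name + ".py", "exec")
    # One dict is both globals and locals, as for a real module body.
    exec(code, module.__dict__)
    return module


simple_demo = load_into_module(SOURCE, "simple_demo")
public_names = sorted(n for n in dir(simple_demo) if not n.startswith("__"))
output = simple_demo.main()
```

Because foo and main share the module's dictionary as their globals, the call to foo() inside main() resolves no matter where `load_into_module` is called from.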

As for the confusing output of dis.dis, I couldn't reproduce that behavior, so I can't study it and provide an explanation.
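
One thing that may be worth checking, offered as a guess rather than a confirmed explanation: a code object returned by compile() or marshal.load() directly holds only the module-level bytecode, and each function body is a separate code object stored in its co_consts. Older versions of dis.dis() do not descend into those nested objects, whereas dis.dis() on a module disassembles every function it finds. The helper `iter_code_objects` and the inline `SOURCE` are hypothetical names for this sketch.

```python
import types

# Hypothetical stand-in for simple.py.
SOURCE = """
def foo():
    return "Inside foo()..."

def main():
    return foo()

main()
"""


def iter_code_objects(code):
    """Yield code and every code object nested in its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            for nested in iter_code_objects(const):
                yield nested


module_code = compile(SOURCE, "<simple>", "exec")
# Collects '<module>' plus the names of the nested function code objects.
code_names = [c.co_name for c in iter_code_objects(module_code)]
```

Passing each yielded object to dis.dis() would show the bytecode of foo and main separately from the top-level code.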

timotree