Correctly replace a function's code object

Question

I am attempting to take the source code of a function, add code to it, and then put it back in the original function.

Basically like so:

from types import ModuleType

new_code = change_code(original_code)
throwaway_module = ModuleType('m')
exec(new_code, throwaway_module.__dict__)
func.__code__ = getattr(throwaway_module, func.__name__).__code__

This works perfectly when new_code doesn't contain any name that wasn't in the original function.

However, when new_code contains a variable name that wasn't in the original func, the last line raises the following error:

ValueError: func() requires a code object with 1 free vars, not 0

Any ideas?

EDIT:

It seems I have found where in the CPython source code this exception is raised (file funcobject.c). I omitted some lines for clarity:

static int
func_set_code(PyFunctionObject *op, PyObject *value, void *Py_UNUSED(ignored))
{
    Py_ssize_t nfree, nclosure;

    // ... lines omitted

    nfree = PyCode_GetNumFree((PyCodeObject *)value);
    nclosure = (op->func_closure == NULL ? 0 :
            PyTuple_GET_SIZE(op->func_closure));
    if (nclosure != nfree) {
        PyErr_Format(PyExc_ValueError,
                     "%U() requires a code object with %zd free vars,"
                     " not %zd",
                     op->func_name,
                     nclosure, nfree);
        return -1;
    }
    Py_INCREF(value);
    Py_XSETREF(op->func_code, value);
    return 0;
}

Does this help you help me? :)

score 4 · Answer 1 · answered Feb 12 '19 at 12:43

This exception is raised when you try to assign a code object whose number of free variables differs from the number of variables the target function closes over. If that sentence sounded like gibberish, take a look at this answer.
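To make the terms concrete, here is a minimal sketch showing where free variable names and their values live, plus a Python mirror of the count comparison that CPython performs. The functions make_adder and plain and the helper free_var_counts are made up for illustration; they are not from the question.

```python
def make_adder(step):
    def add(x):
        return x + step  # `step` is a free variable of `add`
    return add

def plain(x):
    return x + 1  # no free variables

add = make_adder(2)

# The names of the free variables live on the code object...
print(add.__code__.co_freevars)           # ('step',)
# ...while their values live in cells on the function object.
print(add.__closure__[0].cell_contents)   # 2
print(plain.__closure__)                  # None

def free_var_counts(func, code):
    """Hypothetical helper: return (nclosure, nfree) as compared by CPython."""
    nclosure = len(func.__closure__ or ())
    nfree = len(code.co_freevars)
    return nclosure, nfree

print(free_var_counts(add, plain.__code__))  # (1, 0)

# Because the two counts differ, the assignment is rejected.
try:
    add.__code__ = plain.__code__
except ValueError as exc:
    message = str(exc)
    print(message)
```

The assignment is accepted only when the two numbers returned by the helper are equal.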

The easiest way to avoid this problem is to simply rebind the existing name in the obvious way, i.e. f = g instead of f.__code__ = g.__code__. That way the code object always stays with its matching closure (more on that later). In your case, that would look like func = getattr(throwaway_module, func.__name__). Is there some reason you can't do this and were mucking around with internal implementation details instead?

In order to better illustrate what's happening here, suppose we have some silly functions.

def dog():
    return "woof"

def cat():
    return "meow"

def do_stuff(seq):
    t1 = sum(seq)
    seq2 = [e + t1 for e in seq]
    t2 = sum(seq2)
    return t1 + t2

def pair(animal):
    def ret():
        return animal() + animal()
    return ret

cats = pair(cat)

print(dog()) # woof
print(cat()) # meow
print(cats()) # meowmeow
print(do_stuff([1,2,3])) # 30

Even though do_stuff has a different number of local variables than dog, we can still successfully reassign code objects between them.

do_stuff.__code__ = dog.__code__
print(do_stuff()) # woof

However, we can't reassign between cats and dog because cats closes over the argument animal.

print(cats.__code__.co_freevars) # ('animal',)
dog.__code__ = cats.__code__

ValueError: dog() requires a code object with 0 free vars, not 1

This problem can be avoided by simply reassigning the name to the desired function object.

dog = cats
print(dog()) # meowmeow

In fact, even if you did pull off a code object reassignment for a function with a closure, things would most likely not go as expected when the function was executed. This is because the closed-over variables are stored separately from the compiled code (in the function's __closure__), so they would no longer match what the code expects.

def get_sum_func(numbers):
    def ret():
        return sum(numbers)
    return ret

sum_func = get_sum_func([2,2,2]) # sum_func closes over the provided arg

# swap code objects
# quite possibly the most disturbing single line of python I've ever written
sum_func.__code__, cats.__code__ = (cats.__code__, sum_func.__code__)

print(sum_func()) # runs animal() + animal() with animal bound to [2,2,2], which raises TypeError
print(cats()) # would run sum(numbers) with numbers bound to cat, which also raises TypeError
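A self-contained version of the swap makes the mismatch visible. This is only a sketch: it redefines the functions from above, catches the exceptions so both calls run, and collects the messages in a list named errors (a name introduced here for illustration).

```python
def cat():
    return "meow"

def pair(animal):
    def ret():
        return animal() + animal()
    return ret

def get_sum_func(numbers):
    def ret():
        return sum(numbers)
    return ret

cats = pair(cat)
sum_func = get_sum_func([2, 2, 2])

# Both functions have exactly one free variable, so the swap is accepted.
sum_func.__code__, cats.__code__ = cats.__code__, sum_func.__code__

# The closures did not move: sum_func still holds the list, cats still holds `cat`.
print(sum_func.__closure__[0].cell_contents)  # [2, 2, 2]
print(cats.__closure__[0].cell_contents is cat)  # True

errors = []
for func in (sum_func, cats):
    try:
        func()
    except TypeError as exc:
        errors.append(str(exc))

# sum_func tried to call a list; cats tried to sum a function.
print(errors)
```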

As it turns out, we can't easily replace the __closure__ attribute because it is read-only. You could presumably work around it if you were really determined, but that's almost certainly a terrible idea.

# swap closures
# this results in "AttributeError: readonly attribute"
sum_func.__closure__, cats.__closure__ = (cats.__closure__, sum_func.__closure__)
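One way to pair a code object with a different closure without writing to __closure__ is to build a new function object with types.FunctionType, which accepts a closure tuple as its fifth argument. The sketch below reuses pair and cat from above; the names dog_cells and rebuilt are introduced only for illustration, and the tuple of cells must have the same length as the code object's co_freevars.

```python
from types import FunctionType

def cat():
    return "meow"

def dog():
    return "woof"

def pair(animal):
    def ret():
        return animal() + animal()
    return ret

cats = pair(cat)

# Borrow a tuple of cell objects that closes over `dog` instead of `cat`.
dog_cells = pair(dog).__closure__

# Build a new function: same code, globals, name and defaults, different closure.
rebuilt = FunctionType(
    cats.__code__,
    cats.__globals__,
    cats.__name__,
    cats.__defaults__,
    dog_cells,
)

print(rebuilt())  # woofwoof
print(cats())     # meowmeow (the original function is untouched)
```

This creates a separate function object, so existing references to cats keep their old behaviour.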

For more details about function object attributes, see this answer and the docs.
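If the same function object really has to be kept (for example because other references to it exist), one possible approach is to compile the edited source inside a wrapper function whose parameters are the original free variable names, so the compiler again produces a code object with free variables. This is a sketch under assumptions, not something taken from the question: make_adder, new_source and _factory are hypothetical names, and it only works if the edited source still uses every free variable of the original function.

```python
import textwrap

def make_adder(step):
    def add(x):
        return x + step
    return add

add = make_adder(2)
alias = add  # a second reference to the same function object

# Hypothetical edited source; it still refers to the free variable `step`.
new_source = "def add(x):\n    return (x + step) * 10\n"

# Wrap the source in a factory that takes the free variable names as parameters.
freevars = add.__code__.co_freevars
factory_source = (
    "def _factory({}):\n".format(", ".join(freevars))
    + textwrap.indent(new_source, "    ")
    + "    return {}\n".format(add.__name__)
)

namespace = {}
exec(factory_source, namespace)

# The argument values are irrelevant; only the resulting code object is used.
template = namespace["_factory"](*[None] * len(freevars))

# The free variable counts now match, so the assignment is accepted.
add.__code__ = template.__code__

print(add(1))    # 30, using the original closure value step == 2
print(alias(1))  # 30, because alias is the same function object
```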
