11

I'm developing a documentation testing framework -- basically unit tests for PDFs. Tests are (decorated) methods of instances of classes defined by the framework, and these are located and instantiated at runtime and the methods are invoked to execute the tests.

My goal is to cut down on the amount of quirky Python syntax that the people who will write tests need to be concerned about, as these people may or may not be Python programmers, or even very much programmers at all. So I would like them to be able to write "def foo():" instead of "def foo(self):" for methods, but still be able to use "self" to access members.

In an ordinary program I would consider this a horrible idea, but in a domain-specific-languagey kind of program like this one, it seems worth a try.

I have successfully eliminated the self from the method signature by using a decorator (actually, since I am using a decorator already for the test cases, I would just roll it into that), but "self" does not then refer to anything in the test case method.

I have considered using a global for self, and even come up with an implementation that more or less works, but I'd rather pollute the smallest namespace possible, which is why I would prefer to inject the variable directly into the test case method's local namespace. Any thoughts?

kindall
  • Just teach them that `def foo(self):` is part of the boilerplate that needs to go on every function. Don't focus on the why, just emphasize that it MUST be there, and you'll probably be fine. – derekerdmann Aug 10 '10 at 22:36
  • You are probably right, but I'm still interested to see what people come up with! – kindall Aug 10 '10 at 22:46
  • Why don't you just make your classes modules and your methods functions in the module? Less boilerplate and stuff to do. `py.test` does this very effectively. – Noufal Ibrahim Dec 23 '10 at 17:24

5 Answers

6

My accepted answer to this question was pretty dumb but I was just starting out. Here's a much better way. This is only scantily tested but it's good for a demonstration of the proper way to do this thing which is improper to do. It works on 2.6.5 for sure. I haven't tested any other versions but no opcodes are hardcoded into it so it should be about as portable as most other 2.x code.

`add_self` can be applied as a decorator, but that would defeat the purpose (why not just type 'self'?). It would be easy to adapt the metaclass from my other answer to apply this function instead.

import opcode
import types



def instructions(code):
    """Iterates over a code string yielding integer [op, arg] pairs

    If the opcode does not take an argument, just put None in the second part
    """
    code = map(ord, code)
    i, L = 0, len(code)
    extended_arg = 0
    while i < L:
        op = code[i]
        i+= 1
        if op < opcode.HAVE_ARGUMENT:
            yield [op, None]
            continue
        oparg = code[i] + (code[i+1] << 8) + extended_arg
        extended_arg = 0
        i += 2
        if op == opcode.EXTENDED_ARG:
            extended_arg = oparg << 16
            continue
        yield [op, oparg]


def write_instruction(inst):
    """Takes an integer [op, arg] pair and returns a list of character bytecodes"""
    op, oparg = inst
    if oparg is None:
        return [chr(op)]
    elif oparg < 65536L:
        # Fits in the two argument bytes (65536 itself would wrap around to 0)
        return [chr(op), chr(oparg & 255), chr((oparg >> 8) & 255)]
    elif oparg < 4294967296L:
        # The argument is large enough to need 4 bytes and the EXTENDED_ARG opcode
        return [chr(opcode.EXTENDED_ARG),
                chr((oparg >> 16) & 255),
                chr((oparg >> 24) & 255),
                chr(op),
                chr(oparg & 255),
                chr((oparg >> 8) & 255)]
    else:
        raise ValueError("Invalid oparg: {0} is too large".format(oparg))


def add_self(f):
    """Add self to a method

    Creates a new function by prepending the name 'self' to co_varnames, and      
    incrementing co_argcount and co_nlocals. Increase the index of all other locals
    by 1 to compensate. Also removes 'self' from co_names and decrease the index of 
    all names that occur after it by 1. Finally, replace all occurrences of 
    `LOAD_GLOBAL i,j` that make reference to the old 'self' with 'LOAD_FAST 0,0'.   

    Essentially, just create a code object that is exactly the same but has one more
    argument. 
    """
    code_obj = f.func_code
    try:
        self_index = code_obj.co_names.index('self')
    except ValueError:
        raise NotImplementedError("self is not a global")

    # The arguments are just the first co_argcount co_varnames
    varnames = ('self', ) + code_obj.co_varnames   
    names = tuple(name for name in code_obj.co_names if name != 'self')

    code = []

    for inst in instructions(code_obj.co_code):
        op = inst[0]
        if op in opcode.haslocal:
            # The index is now one greater because we added 'self' at the head of
            # the tuple
            inst[1] += 1
        elif op in opcode.hasname:
            arg = inst[1]
            if arg == self_index:
                # This refers to the old global 'self'
                if op == opcode.opmap['LOAD_GLOBAL']:
                    inst[0] = opcode.opmap['LOAD_FAST']
                    inst[1] = 0
                else:
                    # If `self` is used as an attribute, real global, module
                    # name, module attribute, or gets looked at funny, bail out.
                    raise NotImplementedError("Abnormal use of self")
            elif arg > self_index:
                # This rewrites the index to account for the old global 'self'
                # having been removed.
                inst[1] -= 1

        code += write_instruction(inst)

    code = ''.join(code)

    # type help(types.CodeType) at the interpreter prompt for this one   
    new_code_obj = types.CodeType(code_obj.co_argcount + 1,
                                  code_obj.co_nlocals + 1,
                                  code_obj.co_stacksize,
                                  code_obj.co_flags, 
                                  code,
                                  code_obj.co_consts,
                                  names, 
                                  varnames, 
                                  '<OpcodeCity>',
                                  code_obj.co_name,  
                                  code_obj.co_firstlineno,
                                  code_obj.co_lnotab, 
                                  code_obj.co_freevars,
                                  code_obj.co_cellvars)


    # help(types.FunctionType)
    return types.FunctionType(new_code_obj, f.func_globals)



class Test(object):

    msg = 'Foo'

    @add_self
    def show(msg):
        print self.msg + msg


t = Test()
t.show('Bar')
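The reason the rewrite above looks for `LOAD_GLOBAL` can be checked directly. This is a minimal sketch for Python 3 (an assumption on my part; the answer itself targets 2.6.5, where `__code__` is spelled `func_code`). It only inspects the compiled function and does not perform the rewrite:

```python
import dis

def show(msg):
    # No `self` parameter: the compiler treats `self` as a global name.
    return self.msg + msg

code = show.__code__

# Names loaded through LOAD_GLOBAL; these are what add_self would have to
# turn into local-variable loads.
global_loads = [ins.argval for ins in dis.get_instructions(show)
                if ins.opname == "LOAD_GLOBAL"]

print(code.co_varnames)  # the real locals/arguments
print(global_loads)      # names resolved as globals
```

So `self` lives in `co_names` rather than `co_varnames`, which is why the hack has to move it from one tuple to the other and renumber every index that follows.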
aaronasterling
  • Aaron, your original suggestion works fine for my use case (there will only ever be one instance of a given class at a time), but it's always good to see a way to do it better. More to the point, I'm certain I'll learn a lot figuring out what the heck you've done here. I'm pretty shocked and impressed that both you and martineau have been continuing to gnaw away at this problem. :-) – kindall Nov 13 '10 at 21:11
  • @Kindall Doing it this way occurred to me when I solved [this problem](http://stackoverflow.com/questions/3908335/python-function-local-name-binding-from-an-outer-scope/3913185#3913185). This is a pretty obvious application. I'd been too lazy and then martineau started giving the thread attention. – aaronasterling Nov 13 '10 at 21:50
  • @Kindall, updated the comments. I hadn't realized how paltry they were. Should be easier to understand now. – aaronasterling Nov 13 '10 at 23:03
  • Very impressive -- I'm learning a lot, so thanks, esp for the updated comments. – martineau Nov 14 '10 at 03:30
  • FWIW, I just stumbled upon an entry on Michael Foord's Voidspace blog titled [Selfless Python](https://web.archive.org/web/20140616042926id_/http://www.voidspace.org.uk/python/weblog/arch_d7_2006_12_16.shtml#e583) which uses a decorator to do something similar. Equally interesting is how it can be applied to all the methods in a class definition via his [The Selfless Metaclass](https://web.archive.org/web/20130128113535id_/http://www.voidspace.org.uk/python/articles/metaclasses.shtml#the-selfless-metaclass). – martineau Feb 25 '22 at 03:24
5

A small upgrade to aaronasterling's solution (I don't have enough reputation to comment on it), so that the wrapped method can also take arguments:

import functools

def wrap(f):
    @functools.wraps(f)
    def wrapper(self,*arg,**kw):
        f.func_globals['self'] = self        
        return f(*arg,**kw)
    return wrapper

But both of these solutions will behave unpredictably if the function `f` is called recursively for a different instance, because they share one global `self`. To avoid that, clone the function with its own copy of the globals, like this:

import types
class wrap(object):
    def __init__(self,func):
        self.func = func
    def __get__(self,obj,type):
        new_globals = self.func.func_globals.copy()
        new_globals['self'] = obj
        return types.FunctionType(self.func.func_code,new_globals)
class C(object):
    def __init__(self,word):
        self.greeting = word
    @wrap
    def greet(name):
        print(self.greeting+' , ' + name+ '!')
C('Hello').greet('kindall')
Odomontois
  • Nice embellishment, thanks. Too bad you can only mark one best answer. – kindall Aug 11 '10 at 02:26
  • I think both this version and @aaronasterling's original could have problems if a similarly wrapped method of another class instance from the same module was ever called from the current one -- because the global `self` binding would be changed and not restored before it returns. – martineau Nov 10 '10 at 11:02
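The hazard described in the comment above is easy to reproduce with the global-mutating `wrap`. A minimal sketch, written for Python 3 (where `func_globals` is `__globals__`); the `Node` class and its `describe` method are made up purely for illustration:

```python
import functools

def wrap(f):
    """Global-mutating variant: binds `self` in the function's module globals."""
    @functools.wraps(f)
    def wrapper(self, *args, **kwargs):
        f.__globals__["self"] = self  # one binding shared by the whole module
        return f(*args, **kwargs)
    return wrapper

class Node:  # hypothetical class, only here to trigger the problem
    def __init__(self, name, child=None):
        self.name = name
        self.child = child

    @wrap
    def describe():
        before = self.name
        if self.child is not None:
            self.child.describe()  # rebinds the global `self` to the child
        after = self.name          # no longer the instance we started with
        return before, after

leaf = Node("leaf")
root = Node("root", child=leaf)
result = root.describe()
print(result)  # ('root', 'leaf')
```

The nested call leaves the global `self` pointing at the child, so the outer method finishes with the wrong instance. The descriptor version avoids this because each bound function gets its own globals dict.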
4

Here's a one-line method decorator that seems to do the job without modifying any of the special attributes of callable types that the docs mark as read-only:

# method decorator -- makes undeclared 'self' argument available to method
injectself = lambda f: lambda self: eval(f.func_code, dict(self=self))

class TestClass:
    def __init__(self, thing):
        self.attr = thing

    @injectself
    def method():
        print 'in TestClass::method(): self.attr = %r' % self.attr
        return 42

test = TestClass("attribute's value")
ret = test.method()
print 'return value:', ret

# output:
# in TestClass::method(): self.attr = "attribute's value"
# return value: 42

Note that, unless you take precautions to prevent it, `eval()` may automatically add a few entries to the dict passed to it as a side effect, such as a reference to the `__builtin__` module under the key `__builtins__`.
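That side effect is easy to observe. The following is a small sketch for Python 3 (an assumption; there the module is named `builtins`, but the key is still `__builtins__`). `Probe` is a hypothetical stand-in for the test class, and the snippet evaluates a plain expression string rather than a function's code object:

```python
class Probe:  # hypothetical stand-in for TestClass
    attr = "attribute's value"

namespace = {"self": Probe()}
keys_before = set(namespace)

value = eval("self.attr", namespace)

# eval() inserted this key because the dict did not already contain it.
keys_added = set(namespace) - keys_before

# One possible precaution: hand eval() a throwaway copy of the dict.
clean = {"self": Probe()}
eval("self.attr", dict(clean))

print(keys_added)
print(sorted(clean))
```

Passing a copy keeps the caller's dict untouched, at the cost of building a new dict on every call.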

@kindall: Per your comment about how you're using this with methods being in container classes (but ignoring the injection of additional variables for the moment) -- is the following something like what you're doing? It's difficult for me to understand how things are split up between the framework and what the users write. It sounds like an interesting design pattern to me.

# method decorator -- makes undeclared 'self' argument available to method
injectself = lambda f: lambda self: eval(f.func_code, dict(self=self))

class methodclass:
    def __call__():
        print 'in methodclass::__call__(): self.attr = %r' % self.attr
        return 42

class TestClass:
    def __init__(self, thing):
        self.attr = thing

    method = injectself(methodclass.__call__)

test = TestClass("attribute's value")
ret = test.method()
print 'return value:', ret

# output
# in methodclass::__call__(): self.attr = "attribute's value"
# return value: 42
martineau
  • unfortunately I don't think that this will extend to methods with arguments (at least not cleanly). The solution that I have posted (as crappy as it is) will. The problem seems to be a fundamental limitation of eval. – aaronasterling Nov 11 '10 at 02:24
  • also, `func_globals` is read only in that it can not be assigned to, i.e., made to point to a different dict. It can clearly be modified. – aaronasterling Nov 11 '10 at 02:29
  • This is an interesting approach, but if you pass in your own dict of globals, you can't access any other globals. I tried passing in locals, but that didn't work at all. I could create a copy of the function's `func_globals` and update it with my new variable(s), I guess, and pass that in. – kindall Nov 15 '10 at 17:07
  • ... and that's exactly what I ended up doing. I'm not passing anything in, so that issue is not a limitation for my use, but it is convenient for me to be able to "inject" more than one variable for my test methods, so Aaron's bytecode hack (clever though it is) is not quite as good a fit. – kindall Nov 15 '10 at 23:33
  • @kindall: I'm curious how you modified it to allow more than one variable to be injected -- was it by passing the decorator arguments, hardcoding them into it, having several different decorators, or what? – martineau Nov 16 '10 at 02:00
  • Basically, the container class for the method knows what variables to inject (each class has a method for creating the dictionary, which is called before each test). I considered just injecting every attribute of the instance, but decided that would be overkill. – kindall Nov 16 '10 at 02:55
  • My documentation validation framework has classes for Subversion branches, products, and books, plus some generic containers such as groups and suites. Classes can be nested (e.g. books inside a product). Users derive from the provided classes their own classes for *specific* branches, products, books, etc. The provided classes handle printing headers, counting failures, etc.; the users just write methods (tests). The main difference in the injected variables is that books have a `pdf` attribute that points to the book's PDF file (and a `compare` method) and the others don't. – kindall Nov 23 '10 at 02:05
  • BTW, I recently figured out how to create a new function from an existing function, replacing just the globals dict. With that method you can still pass arguments. [Here is an answer](http://stackoverflow.com/questions/4558104/python-evalcompile-sandbox-globals-go-in-sandbox-unless-in-def-why/4558597#4558597) I posted on another question that shows the technique. – kindall Jan 01 '11 at 00:32
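The approach kindall describes in these comments (copy the function's globals, add the injected variables, rebuild the function) might look roughly like the sketch below. It is written for Python 3 and is not kindall's actual code: `inject`, `Book`, `injected_vars` and `page_label` are hypothetical names chosen for illustration.

```python
import types

def inject(func, variables):
    """Return a clone of func that sees `variables` as extra globals."""
    new_globals = dict(func.__globals__)  # copy, so the module is untouched
    new_globals.update(variables)
    clone = types.FunctionType(func.__code__, new_globals, func.__name__,
                               func.__defaults__, func.__closure__)
    clone.__kwdefaults__ = func.__kwdefaults__
    return clone

class Book:  # hypothetical container class
    def __init__(self, pdf):
        self.pdf = pdf

    def injected_vars(self):
        # Each container decides which names its tests get to see.
        return {"self": self, "pdf": self.pdf}

def page_label(page, prefix="p."):
    # Neither `self` nor `pdf` is a parameter; both are injected.
    owner = type(self).__name__
    return "%s: %s %s%d" % (owner, pdf, prefix, page)

book = Book("manual.pdf")
bound = inject(page_label, book.injected_vars())
label = bound(3)
print(label)  # Book: manual.pdf p.3
```

Because the clone keeps the original code object and defaults, ordinary arguments still work, which is the limitation of the `eval` one-liner raised earlier in this thread. The copied dict is a snapshot, so globals rebound in the module afterwards are not seen by the clone.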
3

The trick is to add 'self' to `f.func_globals`. This works in Python 2.6. I really should get around to installing other versions to test stuff like this on. Sorry for the wall of code, but I cover two cases: doing it with a metaclass and doing it with a decorator. For your use case, I think the metaclass is better, since the whole point of this exercise is to shield users from syntax.

import functools

class TestMeta(type):
    def __new__(meta, classname, bases, classdict):
        for item in classdict:
            if hasattr(classdict[item], '__call__'):
                classdict[item] = wrap(classdict[item])
        return type.__new__(meta, classname, bases, classdict)

def wrap(f):
    @functools.wraps(f)
    def wrapper(self):
        f.func_globals['self'] = self        
        return f()
    return wrapper

def testdec(f):
    @functools.wraps(f)
    def wrapper():
        return f()
    return wrapper

class Test(object):
    __metaclass__ = TestMeta
    message = 'You can do anything in python'
    def test():
        print self.message

    @testdec
    def test2():
        print self.message + ' but the wrapper function can\'t take a self argument either or you get a TypeError'

class Test2(object):
    message = 'It also works as a decorator but (to me at least) feels better as a metaclass'
    @wrap
    def test():
        print self.message


t = Test()
t2 = Test2()
t.test()
t.test2()
t2.test()
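For anyone trying this on Python 3, where `__metaclass__` in the class body is ignored and `func_globals` is spelled `__globals__`, a rough equivalent of the metaclass variant could look like this. It is only a sketch under those assumptions: `SelfInjector` and `SelflessMeta` are made-up names, and instead of writing to the shared module globals it swaps in a per-access copy, in the spirit of the descriptor shown in another answer.

```python
import types

class SelfInjector:
    """Descriptor: on instance access, clone the function with `self` bound
    in a private copy of its globals."""
    def __init__(self, func):
        self.func = func

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self.func  # accessed on the class: nothing to bind
        f = self.func
        scope = dict(f.__globals__, self=obj)
        return types.FunctionType(f.__code__, scope, f.__name__,
                                  f.__defaults__, f.__closure__)

class SelflessMeta(type):
    def __new__(meta, name, bases, namespace):
        for key, value in list(namespace.items()):
            # Wrap plain functions only; dunder methods keep explicit self.
            if isinstance(value, types.FunctionType) and not key.startswith("__"):
                namespace[key] = SelfInjector(value)
        return super().__new__(meta, name, bases, namespace)

class Test(metaclass=SelflessMeta):
    message = "You can do anything in Python"

    def test():
        return self.message

    def shout(suffix):
        return self.message.upper() + suffix

t = Test()
print(t.test())
print(t.shout("!"))
```

Test authors then write `def test():` with no decorator at all, which is the point of using a metaclass here.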
aaronasterling
  • Thanks, that looks like pretty much exactly what I needed to know! – kindall Aug 11 '10 at 02:26
  • What about the fact that this seems to ignore that `func_globals` is a Read-only Callable Type Special Attribute according to the [docs](http://docs.python.org/reference/datamodel.html?highlight=func_globals#the-standard-type-hierarchy)? Or does that mean the attribute itself is read-only but not the contents of what it refers to? – martineau Nov 10 '10 at 10:43
  • @martineau, this was an early python project for me and I was a rank amateur. I've actually been meaning to revisit it with a proper bytecode hack. As per the question, the attribute itself is readonly but you can mess with it pretty freely. The main problem is that it refers to the actual global environment of the module that the function was defined in and is not sequestered as I thought it was at the time that I did this. I am going to get around to doing this up proper pretty soon. – aaronasterling Nov 10 '10 at 13:10
  • @aaronasterling: A bytecode hack might not be necessary -- see the [alternative](http://stackoverflow.com/questions/3453976/how-to-get-self-into-a-python-method-without-explicitly-accepting-it/4150399#4150399) I just added. – martineau Nov 11 '10 at 00:26
  • @martineau, added my [new solution](http://stackoverflow.com/questions/3453976/how-to-get-self-into-a-python-method-without-explicitly-accepting-it/3454053#3454053). – aaronasterling Nov 13 '10 at 13:48
2

This might be a use case for decorators - you give them a small set of lego bricks to build functions with, and the complicated framework stuff is piped in via @testcase or somesuch.

Edit: You didn't post any code, so this is going to be sketchy, but they don't need to write methods. They can write ordinary functions without "self", and you could use decorators like in this example from the article I linked:

class myDecorator(object):

    def __init__(self, f):
        print "inside myDecorator.__init__()"
        f() # Prove that function definition has completed

    def __call__(self):
        print "inside myDecorator.__call__()"

@myDecorator
def aFunction():
    print "inside aFunction()"
chryss
  • Yes, I already do use decorators to mark methods as test cases, because I need to know 1) which methods to run and 2) what order to run them in. The bulk of the framework is working; I just want to get rid of that pesky "self." – kindall Aug 10 '10 at 22:45
  • Edited, gave example how to use decorators without the need to use `self` in the function. – chryss Aug 10 '10 at 22:52
  • but self needs to refer to something within the defined functions. this doesn't accomplish that – aaronasterling Aug 10 '10 at 22:56
  • That works great for eliminating the self in the method signature, and I came up with something similar (using a function-style decorator, though) but of course there is no access to self inside the function. What I'd like is to still be able to access the instance in the method _without_ passing it in explicitly. I will edit the question to make this clearer. – kindall Aug 10 '10 at 22:59
  • I fully see that this doesn't solve the entire problem, but without seeing how the OP constructed the framework, and what exactly the users are supposed to supply, it's hard to be more specific. Maybe looking into `@contextmanager` and something along the lines of this: http://code.activestate.com/recipes/534150/ would be a good idea, too -- again a different approach. – chryss Aug 10 '10 at 23:05
  • Thanks for thinking outside the box; I will have a look and see if I can figure out a way to use a context manager for my needs. – kindall Aug 10 '10 at 23:55