Consider the code below:

def f(a, b, c):
    import inspect
    stack = inspect.stack()
    # How can I find the original variable names, i.e. 'arg1', 'arg2' and 'arg3'?

arg1 = arg2 = arg3 = None

f(arg1, arg2, arg3)

Now I would like to know the original variable names that were used to invoke f().

As in Alex Martelli's answer, we can get the frames using the inspect module. But that doesn't help here, since I would have to parse "f(arg1, arg2, arg3)\n", which would be messy.

Any alternatives?

nehem
  • There is no guarantee there even were any variable names. Why are you trying to do this? – user2357112 Jan 10 '19 at 04:19
  • You're going to be fighting the language design the whole way if you try this. Python isn't R. Python parameter passing passes objects, not variables, expressions, thunks, etc. – user2357112 Jan 10 '19 at 04:21
  • @user2357112 the expression of the previous frame is clearly written as `f(arg1, arg2, arg3)`. If those don't look like variable names, I'm not sure what else you imagine. – nehem Jan 14 '19 at 02:35
  • For *that* call, there were variable names, but if all you care about is *that* call, you might as well write `return ('arg1', 'arg2', 'arg3')`. For general calls, there's no guarantee any variable names were involved. – user2357112 Jan 14 '19 at 03:34
  • While it's true that there is no guarantee variable names were involved, it's possible to know whether variable names or literals were involved. The `inspect` module shows you how. – nehem Jan 14 '19 at 04:37
  • `inspect` isn't as reliable as you might think. Even if you inspect the bytecode instead of trying to parse the source, and even if you get the bytecode inspection correct, it's still not going to pick up calls made from C. You can end up looking at a stack frame that has nothing to do with the call to your function. – user2357112 Jan 19 '19 at 19:49

2 Answers

It might not be advisable, but this can be done, at least to some extent, by parsing the bytecode of the caller. Specifically, you would get the stack frame of the caller with e.g. frame = inspect.stack()[1][0] (index 0 is your own function's frame, index 1 is its caller), then look at frame.f_code.co_code for the raw bytecode and frame.f_lasti for the index of the last instruction executed in that frame, which will be the CALL_FUNCTION opcode that caused your function to be called. (This holds only as long as you haven't returned yet; after that, f_lasti gets updated as execution proceeds in the caller's frame.)
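As a minimal sketch of that frame lookup (the helper name caller_position is hypothetical, and f_lasti and co_code are CPython-specific), the following reads the caller's code object and f_lasti without decoding any bytecode yet:

```python
import inspect

def caller_position():
    """Return (function name, f_lasti, bytecode length) for our direct caller."""
    # Index 0 is this function's own record; index 1 is whoever called us.
    # context=0 skips reading source lines, which are not needed here.
    frame = inspect.stack(0)[1][0]
    try:
        code = frame.f_code
        # f_lasti is a byte offset into co_code; while we are still running,
        # it normally points at the instruction that made the call.
        return code.co_name, frame.f_lasti, len(code.co_code)
    finally:
        del frame  # avoid keeping a reference cycle through the frame alive

def demo():
    return caller_position()

name, lasti, size = demo()
print(name, lasti, size)
```

The numbers printed depend on the interpreter version; only the function name is stable.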

Parsing the bytecode isn't really hard in itself, at least until you get to reconstructing control flow (which you can avoid if you can assume the caller doesn't use e.g. the ternary operator, `and` or `or` in the arguments to the call to your function). Still, if you haven't played with interpreter internals or other marginally "low-level" stuff before, it might be a tall mountain to climb; it likely won't be a quick exercise.

So, what about the complications? Well, for one, bytecode is an implementation detail of CPython, so this won't work in alternative interpreters such as Jython. In addition, the bytecode can change from one CPython version to the next, so most CPython versions will need slightly different parsing code.
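To see the version dependence concretely, this small sketch disassembles the call from the question and lists the opcode names. The names printed are whatever the running interpreter uses; the one assumption made here is that the call opcode's name starts with CALL, which holds for the CPython releases I know of but is not guaranteed.

```python
import dis
import sys

# Compile the call from the question without running it.
code = compile("f(arg1, arg2, arg3)", "<example>", "eval")
opnames = [insn.opname for insn in dis.get_instructions(code)]

# Assumption: call opcodes are named CALL_FUNCTION, CALL, or similar.
call_ops = [name for name in opnames if name.startswith("CALL")]

print(sys.version_info[:2], opnames)
print("call opcode(s):", call_ops)
```

Running this under two different CPython versions is a quick way to check how much of a parser would need to change.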

Earlier, I also said "to some extent" - what did I mean by that? Well, I already mentioned that handling conditionals might be hard. In addition, you can't get the variable name if the value wasn't stored in a variable! For example, your function might be called like f(1, 2, 3), f(mylist[0], mydict['asd'], myfunc()) or f(*make_args(), **make_kwargs()). In the first two cases you might just decide that instead of a variable name, what you really want to know is the expression that corresponds to each argument... but what about the last case? Either expression might correspond to more than one argument, and you don't know which argument came from which expression! In general, you can't find that out either: since the values came from function calls, you can't look at them without calling the functions again, and nothing guarantees the functions won't have side effects or that they'll return the same values again. Or maybe, like earlier with the conditionals, you might be able to assume the caller doesn't do any of that.
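The distinction between "argument is a plain variable" and "argument is some other expression" can be illustrated with a short sketch. Note that this swaps in a different technique: it parses the source text of the call with the ast module rather than the bytecode, and it assumes the call is available as one complete expression (like the "f(arg1, arg2, arg3)\n" string from the question). The helper name plain_name_args is hypothetical.

```python
import ast

def plain_name_args(call_source):
    """For each positional argument, return its name if it is a bare
    variable, otherwise None."""
    node = ast.parse(call_source.strip(), mode="eval").body
    if not isinstance(node, ast.Call):
        raise ValueError("expected a single call expression")
    # ast.Name covers bare variables; literals, subscripts, calls and
    # starred arguments are all reported as None.
    return [arg.id if isinstance(arg, ast.Name) else None for arg in node.args]

print(plain_name_args("f(arg1, arg2, arg3)\n"))
print(plain_name_args("f(1, mylist[0], myfunc())"))
print(plain_name_args("f(*make_args(), **make_kwargs())"))
```

For the starred call, the result cannot say how many arguments the callee actually receives, which is exactly the ambiguity described above.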

In short, it is possible, to some degree, and it might not even be especially hard to do if you know what you're doing, but it's certainly not going to be simple, fast, general, or advisable.


Edit: If, after all that, you still feel like doing it (or you're just curious what it would look like), here's an extremely limited proof-of-concept to get you started. (It only works for CPython 3.4, only with positional arguments, and only with local variables. For versions before 3.4 there is no dis.get_instructions(), so the parsing has to be done manually. It also gets more and more complex as you add support for more opcodes, whose handling might depend on the results of previous opcodes.)

import inspect, dis

def get_caller_arg_names():
        """Get the argument names used by the caller of our caller"""
        frame = inspect.stack()[2][0]  # [0] is this function, [1] our caller, [2] its caller
        lasti, code = frame.f_lasti, frame.f_code
        insns = list(dis.get_instructions(code))
        for call_ind, insn in enumerate(insns):
                if insn.offset == lasti:
                        break
        else:
                assert False, "Frame's lasti doesn't match the offset of any of its instructions!"
        insn = insns[call_ind]
        assert insn.opcode == dis.opmap['CALL_FUNCTION'], "Frame's lasti doesn't point to a CALL_FUNCTION instruction!"
        assert not insn.arg & 0xff00, "This PoC doesn't support keyword arguments!"
        argcount = insn.arg & 0xff
        assert call_ind >= argcount, "Bytecode doesn't have enough room for loading all the arguments! (At least without magic)"
        argnames = []
        for insn in insns[call_ind-argcount:call_ind]:
                assert insn.opcode == dis.opmap['LOAD_FAST'], "This PoC only supports direct local variables (LOAD_FAST) without any hijinks!"
                argnames.append(insn.argval)
        return argnames

if __name__ == '__main__':
        def callee(arg1, arg2, arg3):
                print(get_caller_arg_names())

        def caller():
                a, b, c = 1, 2, 3
                callee(a, b, c)

        caller()
Aleksi Torhamo

That seems like a rough syntax workaround you're trying out there. Pass a dict or use **kwargs instead.

Using a dict, which stores key-value pairs:

def f(d):
    print(d)
    print(d.keys())

f({'arg1': None, 'arg2': None, 'arg3': None})

# Output
# {'arg1': None, 'arg2': None, 'arg3': None}
# dict_keys(['arg1', 'arg2', 'arg3'])

Using **kwargs, which accepts a variable number of keyword arguments:

def f(**kwargs):
    d = dict(kwargs)  # copy into a new dictionary (kwargs is already a dict)
    print(d)
    print(d.keys())

    # or use it directly
    for arg in kwargs:
        print(arg, '=', kwargs[arg])

f(arg1=None, arg2=None, arg3=None)

# Output
# {'arg1': None, 'arg2': None, 'arg3': None}
# dict_keys(['arg1', 'arg2', 'arg3'])
# arg1 = None
# arg2 = None
# arg3 = None

You can then look up the argument names using the keys of the dictionary.
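If the values already live in variables, as in the question, the caller can repeat each variable name as the keyword. A minimal sketch (the caller spells the names out explicitly; nothing is inferred from the call site):

```python
def f(**kwargs):
    # Keyword arguments keep the order in which the caller wrote them
    # (guaranteed from Python 3.6 on).
    return list(kwargs)

arg1 = arg2 = arg3 = None
names = f(arg1=arg1, arg2=arg2, arg3=arg3)
print(names)
```

This costs some repetition at the call site, but it works in any Python implementation and does not depend on frames or bytecode.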

TrebledJ