-1

Task

I would like to have a way to access the original name of a passed argument. I looked at the answers of

which seemed a bit complicated to me. Another idea is to write

import pandas as pd
my_var = pd.DataFrame(...)
def f(dataset: pd.DataFrame):
    for name, obj in globals().items():
        if id(obj) == id(dataset):
            return name
f(my_var)

which returns 'my_var'. This makes sense to me and does not seem to be brittle.

However, since I did not see such an answer anywhere, I am wondering if I am missing something and this answer is actually a bad idea.

Questions

  1. Is this code a good/valid idea?
  2. If not, why is a different answer better, and which one?

What am I NOT asking

I am NOT asking "how to get the original variable name of variable passed to a function". I am asking whether/why my suggested code is problematic. This is a different question.

Background

I want to use this as a helper function in a data analytics task, where I use the variable's name as a label for the later plot.

import pandas as pd

def f(dataset: pd.DataFrame):
    return_dataframe = do_stuff(dataset)
    for name, obj in globals().items():
        if id(obj) == id(dataset):
            return_dataframe["group"] = name
            return return_dataframe

data_frame_for_plotting = pd.concat([f(data_train), f(data_test)])
deceze
Make42
  • `my_var = 1; my_other_var = 1; f(my_other_var)` — Fails… – deceze Mar 11 '23 at 11:33
  • Why are you attempting to do this in the first place? – deceze Mar 11 '23 at 11:34
  • Another example that fails: `f(42)`. – deceze Mar 11 '23 at 11:35
  • @deceze: In what way is my passage "Background" not sufficient to answer your question "Why are you attempting to do this in the first place?" ? – Make42 Mar 11 '23 at 11:35
  • Another example that fails: `def foo(): bar = 42; f(bar)`. – deceze Mar 11 '23 at 11:35
  • Fair enough. Variable names aren't data. Don't use variable names as data. Labels are data. Explicitly pass data as strings. Variable names are subject to change as you refactor your code, sometimes for technical necessities, sometimes for readability. Don't make the outcome of your program dependent on the names of the variables. – deceze Mar 11 '23 at 11:37
  • @deceze: I do not see what the point of the example `def foo(): bar = 42; f(bar)` is. `def foo(): bar = 42; print(bar)` also throws an error. `f(42)` is outside the scope of the supposed usage of the function `f`. If I expect a number and get a string, another function would also not work as intended. – Make42 Mar 11 '23 at 11:39
  • It's not apparent in comments, but the two lines are supposed to be within `foo`. The point being, your `f` fails when the variable it's tasked to resolve isn't `global`. – deceze Mar 11 '23 at 11:41
  • I'd say: 1) This is fundamentally always a bad idea, as the many comments in the questions you've already discovered point out. It might work for you in specific situations, but is never generalisable. 2) The possible weakness of your approach is quite obvious: it fails when two global variables hold a value with the same `id`. Have you considered how two values may have the same `id`? Especially [small integers](https://stackoverflow.com/a/15172182/476), which you use as example? – deceze Mar 11 '23 at 11:49
  • @deceze: This is only supposed to be used with pandas dataframes, otherwise the function does not work in the first place at all. – Make42 Mar 11 '23 at 11:52
  • If you place such specific restrictions on the applicability, then explicitly mention them. – deceze Mar 11 '23 at 11:56
  • @deceze: I thought that the Background made this clear, but that was writer's bias, I guess. Anyway, I added this. However, knowing StackOverflow, once a question is closed, no argument or edit is going to reopen it ;-). – Make42 Mar 11 '23 at 11:57

2 Answers

4

Variable names aren't data. Don't use variable names as data. Labels are data. Explicitly pass data as strings. Variable names are subject to change as you refactor your code, sometimes for technical necessities, sometimes for readability. Don't make the outcome of your program dependent on the names of the variables.
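
As a rough illustration of "explicitly pass data as strings", the labels can live in a dict next to the datasets. This is only a sketch: `tag_rows` and `datasets` are hypothetical names, and plain lists of dicts stand in for the data frames so that the snippet runs without pandas. With pandas, the same idea could be written with `DataFrame.assign(group=label)`.

```python
def tag_rows(rows, label):
    """Return copies of `rows` with an explicit 'group' label attached."""
    return [{**row, "group": label} for row in rows]

# Stand-ins for data_train / data_test (assumption: real code would use DataFrames).
data_train = [{"x": 1}, {"x": 2}]
data_test = [{"x": 3}]

# The labels are ordinary data; renaming a variable cannot change them.
datasets = {"train": data_train, "test": data_test}

rows_for_plotting = [
    tagged
    for label, rows in datasets.items()
    for tagged in tag_rows(rows, label)
]
print(rows_for_plotting)
```

The cost is one extra string per dataset, and the label survives any later renaming or restructuring of the code.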


Having said that, your function works iff:

  • You take care to never assign data frames to two or more variables for any reason; i.e. `foo = bar` will already break it.
  • Your variables must be globals in the same module, i.e. you cannot wrap your code into functions or separate it into modules.
  • You're aware that this ties the outcome of your program to your variable names, and when refactoring your code for readability or other reasons your output may also change.

If that works for you in your specific situation… well… go for it. But generally speaking this is terrifically brittle and restricts your coding freedom. Is that worth it to save a few keystrokes?
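
The first two conditions can be reproduced in a few lines. This is a sketch, not the code from the question: `names_bound_to` and `load_and_label` are hypothetical helpers, the namespace is passed in explicitly instead of always using `globals()`, and plain lists stand in for data frames. Unlike the question's `f`, the helper returns every matching name rather than the first one, which makes the ambiguity visible.

```python
def names_bound_to(obj, namespace):
    """Return every name in `namespace` bound to the very same object as `obj`."""
    return [name for name, value in namespace.items() if value is obj]

# A dict standing in for the module's globals (assumption for illustration).
data_train = [1, 2, 3]
namespace = {"data_train": data_train}

# Condition 1: a second name for the same object makes the lookup ambiguous.
namespace["backup"] = data_train
print(names_bound_to(data_train, namespace))  # ['data_train', 'backup']

# Condition 2: a variable local to a function is not in globals() at all.
def load_and_label():
    local_data = [4, 5, 6]
    return names_bound_to(local_data, globals())

print(load_and_label())  # []
```

With the question's version, the first case silently returns whichever name happens to come first in the dict, and the second case falls off the end of the loop and returns `None`.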

deceze
  • "Is that worth it to save a few keystrokes?" In data-analytics prototyping, where I write a prototype per hour? Yes, absolutely! Such code will be rewritten for production anyway. The fact that renaming the variable renames the label in the plot is not a bug; it's a feature I want for prototyping. "Your variables must be globals in the same module, i.e. you cannot wrap your code into functions or separate it into modules." That is a major point to consider when rewriting it for production code. – Make42 Mar 11 '23 at 12:17
  • Yeah, for rapid prototyping, fair enough maybe. But this definitely has no place in production code, IMO. Not just because of the module thing. – deceze Mar 11 '23 at 12:23
1

First off, I agree with @deceze: this is not something that is done. But if you still want to use it (as in, rapid prototyping and you double-pinky-promise to get rid of it before anyone else sees it), this removes some of the downsides of your approach: it will work as long as the argument is a variable; it does not depend on the value, and does not care about whether the variable is global or not. On the flip side, you have to `pip install executing`.

import inspect
import executing
import ast

def getArgName(argNo):
    thisFrame = inspect.currentframe()
    funcFrame = thisFrame.f_back
    callerFrame = funcFrame.f_back
    callNode = executing.Source.executing(callerFrame).node
    args = callNode.args
    arg = args[argNo]
    assert isinstance(arg, ast.Name), f"Argument {argNo} to {funcFrame.f_code.co_name} has to be a plain variable"
    return arg.id

def seek(what):
    name_of_what = getArgName(0)
    print(f"{name_of_what} is {what}")

answer = 42
seek(answer)
# => answer is 42

However... Even though it fixes some of the downsides of the naive solution, I have to reiterate that this is still not how Python is supposed to be used. Python people generally hate hacks like these.
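
To see why the `isinstance(arg, ast.Name)` check is there, it may help to look at what the call's syntax tree contains for different kinds of arguments. The sketch below uses only the standard-library `ast` module on source strings; it does not use `executing`, and `first_arg_name` is a hypothetical helper for illustration only.

```python
import ast

def first_arg_name(call_source):
    """Return the first positional argument's name if it is a bare variable, else None."""
    call = ast.parse(call_source, mode="eval").body
    if not isinstance(call, ast.Call) or not call.args:
        return None  # not a call, or no positional arguments
    arg = call.args[0]
    return arg.id if isinstance(arg, ast.Name) else None

print(first_arg_name("seek(answer)"))          # answer
print(first_arg_name("seek(answer + 1)"))      # None: the argument is an expression
print(first_arg_name("seek(data['train'])"))   # None: the argument is a subscript
print(first_arg_name("seek(what=answer)"))     # None: keyword arguments are not in .args
```

Only the first call has a single name to report; in the other cases there is no unambiguous "variable name" for the argument, so an approach like this has to refuse rather than guess.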

Amadan
  • Thank you! What do you think, are the advantages of your answer in contrast to the answers of the two questions I linked to? – Make42 Mar 13 '23 at 20:21
  • Huh, I think I missed those links on the first read? Those questions sure have many answers :D Some are restricted to local variables, just like yours is restricted to globals; relying on value has drawbacks that @deceze spoke of; some have a good idea but are fragile (e.g. naively split source of the calling line by commas). Also, relying on source may fail if source is missing (executing from `.pyc` files alone). `sorcery` and `varname` both use `executing`, like I did, so underlying mechanism is solid though I haven't used those packages before and can't tell you what they exactly do. – Amadan Mar 14 '23 at 00:30
  • In the end, apart from `executing` black magic to identify the calling node, I think this code is simple enough to understand, and robust enough to trust. Still wouldn't actually use it though. – Amadan Mar 14 '23 at 00:31