
I plan to create a function and run the function with several data sets.

I would like to add a column that holds, for every record, the name of the data set, taken from the name of the data frame passed as a parameter.

My example below fills the column with the parameter name argument1, rather than the name of the variable that was passed in, string_i_want.

Is there a way I could obtain the name of the variable passed as the parameter, string_i_want, and assign that as the value in the 'scenario' column rather than argument1?

import pandas as pd

string_i_want = pd.DataFrame([1, 2, 3, 4], columns=['column_name'])
def example(argument1):
    argument1['squared'] = argument1['column_name'] ** 2
    argument1['scenario'] = f'{argument1=}'.split('=')[0]
    return argument1
example(string_i_want)
    

The returned data frame is:

[image: data frame whose 'scenario' column contains argument1 in every row]

The returned data frame I would instead like to build is:

[image: the same data frame with string_i_want in every row of the 'scenario' column]

windyvation
  • Does this answer your question? [How do I pass a variable by reference?](https://stackoverflow.com/questions/986006/how-do-i-pass-a-variable-by-reference) – Rojo Apr 07 '23 at 21:52
  • what does `f'{argument1=}'` do? – njzk2 Apr 07 '23 at 22:01
  • It is a shortcut for `f"argument1={argument1}"` @njzk2 – JonSG Apr 07 '23 at 22:03
  • It's not possible to get the name of the variable that's passed as an argument. The function just receives the value. If you need to pass the name, you have to do it as a separate argument. – Barmar Apr 07 '23 at 22:29
  • Not in ordinary coding. There are hacks to get to it, but they would be error-prone. – jsbueno Apr 07 '23 at 22:38

2 Answers


I recognize this is super brittle, but I just want to demonstrate a potential path forward using a stack trace.

import pandas as pd
import traceback
import re

def example(argument1):
    current_stack = traceback.extract_stack()
    called_by = current_stack[-2].line
    # Raw string so the backslashes reach the regex engine unchanged
    param_name = re.findall(r"example\(([^)]*)\)", called_by)[0]
    argument1['squared'] = argument1['column_name'] ** 2
    argument1['scenario'] = param_name
    return argument1

string_i_want = pd.DataFrame([1, 2, 3, 4], columns=['column_name'])
print(string_i_want)
print(example(string_i_want))

In this very specific example, you will get back:

   column_name
0            1
1            2
2            3
3            4

   column_name  squared       scenario
0            1        1  string_i_want
1            2        4  string_i_want
2            3        9  string_i_want
3            4       16  string_i_want
JonSG

"Variables" in Python are actually tags attached to objects in a given scope. That is in contrast with some statically typed languages, where a variable is a "box" containing an object.

The implication is that the same object you get as an argument can have more than one name in the scope of the function that called yours. Or it could have no name at all: for example, if the object passed is inside a list and the function call is something like myfunction(mylist[2]), the object passed would be the third item in the list, but it would have no name.
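As a small illustration of the point above, here is a sketch using plain lists rather than data frames. The names show_id, first, second, and container are made up for this example:

```python
def show_id(obj):
    # The function receives only the object, never the caller's name for it.
    return id(obj)

first = [1, 2, 3]
second = first            # a second tag attached to the same object
container = [[4, 5, 6]]   # the inner list has no name of its own

print(first is second)                     # True
print(show_id(first) == show_id(second))   # True

# This call works fine, yet there is no variable name to recover.
show_id(container[0])
```

Inside show_id there is nothing that distinguishes a call made with first from one made with second.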

So, while Python's introspection capabilities make it possible to check the namespace of the caller function and search for labels attached to a received argument, that is not a recommended practice, as there are several cases where it would fail.
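To make those failure cases concrete, here is a rough sketch of what such a namespace search could look like. It is an illustration of the fragile approach, not a recommendation. The helper names_in_caller and the function demo are hypothetical, and the sketch assumes CPython, where inspect.currentframe() returns a frame object:

```python
import inspect

def names_in_caller(obj):
    """Return every name in the caller's local namespace bound to obj."""
    # f_back is the frame of whoever called this function (CPython assumption).
    caller_locals = inspect.currentframe().f_back.f_locals
    return [name for name, value in caller_locals.items() if value is obj]

def demo():
    data = [1, 2, 3]
    alias = data          # same object, second name
    items = [[4, 5, 6]]   # inner list has no name
    return names_in_caller(data), names_in_caller(items[0])

two_names, no_name = demo()
print(sorted(two_names))  # ['alias', 'data']
print(no_name)            # []
```

The search returns two candidates in the first case and none in the second, so the function cannot decide which label to use.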

What is possible, and with no drawbacks, is for you to write the name you want as part of the function call itself. A function in Python can accept arbitrarily named keyword arguments, which it receives as a dictionary.

Maybe that could work for you?

def example(**kwargs):
    # Each keyword argument arrives as a name -> object pair in kwargs
    for name, item in kwargs.items():
        item['scenario'] = name

example(string_i_want=string_i_want)

You just have to type the name twice in the function call, and the one before the = is the one that will be used (they can be different, of course).
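For completeness, here is a fuller sketch of the keyword-argument idea applied to the data frame from the question. The second data frame another_one is invented for illustration. Copying each frame and concatenating the results are design choices of this sketch and were not part of the original question:

```python
import pandas as pd

def example(**kwargs):
    """Add 'squared' and 'scenario' columns; each keyword is the scenario label."""
    results = []
    for name, frame in kwargs.items():
        frame = frame.copy()  # assumption: leave the caller's data frame untouched
        frame['squared'] = frame['column_name'] ** 2
        frame['scenario'] = name
        results.append(frame)
    # Stack all scenarios into one data frame with a fresh index
    return pd.concat(results, ignore_index=True)

string_i_want = pd.DataFrame([1, 2, 3, 4], columns=['column_name'])
another_one = pd.DataFrame([5, 6], columns=['column_name'])  # hypothetical second data set

combined = example(string_i_want=string_i_want, another_one=another_one)
print(combined)
```

If you would rather modify the frames in place, as the function in the question does, drop the copy() call.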

jsbueno