I have the code below, which works without error. In the where statement I can refer to the column simply as label, without the dataframe syntax df['label'].

Working Code

However, when I use it inside a function as shown below, the same logic does not work and I get an error saying that the column "label" is not defined. Why does this not work, and how can I make it work without using the df['label'] syntax? Thanks

Error in a function

Ben

1 Answer

That's because, inside the function, the name label is neither defined as a local variable nor available as a global variable. You will find helpful information about variables inside and outside a function in these links: link 1, link 2. You can try something like this:

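import pandas as pd

def func(f):
    label = f.iloc[:, 0]  # get the first column and bind it to a local name
    # Note: writing into vars() inside a function does not create usable
    # local names; the filter works because label is assigned explicitly above.
    for col in f.columns:
        vars()[col] = f[col]
    f = f.where(label == 'A')  # rows that do not match become NaN
    print(f)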

df = pd.DataFrame({'label':['A','B','A'], 'value':[1,2,3]}) 

func(df)

>>>   label  value
    0     A    1.0
    1   NaN    NaN
    2     A    3.0
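
One caveat about the loop over f.columns: as far as I know, assigning into the dictionary returned by vars() inside a function does not create real local variables, so the example above works only because label is assigned explicitly on the first line of func. A minimal sketch of that behaviour (the function name assign_through_vars is made up for illustration):

```python
def assign_through_vars():
    # Inside a function, vars() is the same as locals(). Writing into the
    # returned dict does not create a real local variable.
    vars()["label"] = "A"
    try:
        # label is never assigned in this function, so Python looks it up
        # as a global name; no such global exists, so NameError is raised.
        return label
    except NameError as exc:
        return f"NameError: {exc}"

result = assign_through_vars()
print(result)
```

So the loop can be dropped from func without changing its result.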

If you only want the rows that satisfy a specific condition, you can do something like this:

import pandas as pd

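def func(f):
    # This loop has no effect on the filter below, which uses f['label'].
    for col in f.columns:
        vars()[col] = f[col]
    f = f.loc[f['label'] == 'A']  # keep only the rows where label is 'A'
    print(f)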

df = pd.DataFrame({'label':['A','B','A'], 'value':[1,2,3]}) 
func(df)


>>>   label  value
    0     A      1
    2     A      3

You can check out this and other ways to select rows based on column values in this link.
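
If the goal is to use the bare column name without writing df['label'], one option not covered above is DataFrame.query, which resolves names in its expression string against the dataframe's columns. A minimal sketch, assuming the same sample dataframe (filter_label_a is a hypothetical name):

```python
import pandas as pd

def filter_label_a(f):
    # Inside the query string, label is resolved as a column of f,
    # so no Python variable named label has to exist in this function.
    return f.query("label == 'A'")

df = pd.DataFrame({'label': ['A', 'B', 'A'], 'value': [1, 2, 3]})
result = filter_label_a(df)
print(result)
```

Unlike where, this drops the non-matching rows instead of replacing them with NaN, so it behaves like the .loc version.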

MrNobody33