I receive a new dataset every month, and the number of columns can change.

I want to create a new column in a dataframe with a function like this:

def calcul(**kwargs):

    [...]

    return result

I would create my column like this:

df['result'] = df.apply(lambda x: calcul(x['A1'], x['A2'], x['B1']), axis =  1)

But I can also have this case:

df['result'] = df.apply(lambda x: calcul(x['A1'], x['A2'], x['A3'], x['B1'], x['B2']), axis =  1)

I tried to build a list of arguments depending on the data and pass it in with sys.stdout.write(), but it doesn't work:

liste = ["x[\'A1\']", "x[\'A2\']", "x[\'B1\']"]

df['result'] = df.apply(lambda x: calcul(sys.stdout.write(", ".join(liste))), axis =  1)
Poojan
Kalhiren
  • I think you mean `*args`: https://stackoverflow.com/questions/3394835/use-of-args-and-kwargs – ALollz Feb 04 '20 at 16:13
  • It doesn't work because `df.apply()` is not expecting you to be writing to stdout – Alex W Feb 04 '20 at 16:22
  • What kind of data are you working with? What is the function meant to do? Some more context would be good. – AMC Feb 04 '20 at 23:31
  • The function is complex: its goal is to calculate an accounting provision, which is why I did not go into detail. The variable number of columns depends on the pivot of another column during preprocessing. – Kalhiren Feb 05 '20 at 07:53

1 Answer

IIUC you can replace:

df['result'] = df.apply(lambda x: calcul(x['A1'], x['A2'], x['B1']), axis =  1)

With:

cols = ['A1', 'A2', 'B1']
df['result'] = df.apply(lambda x: calcul(*[x[c] for c in cols]), axis=1)

Where cols is your list of columns that changes.

The * unpacks the list, so it is equivalent to your original line.
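
To make the unpacking concrete, here is a minimal, self-contained sketch. The body of calcul (a plain sum), the sample data, and the prefix rule used to build cols are all assumptions for illustration only; the real provision calculation and column selection will differ.

```python
import pandas as pd


def calcul(*args):
    # Hypothetical stand-in for the real calculation: it just sums its inputs.
    # *args collects any number of positional arguments into a tuple.
    return sum(args)


# Sample data (assumed); next month's file could have more or fewer columns.
df = pd.DataFrame({
    "A1": [1, 2],
    "A2": [10, 20],
    "A3": [100, 200],
    "B1": [1000, 2000],
})

# Build the column list from whatever is present. The "A"/"B" prefix rule is
# an assumption; any way of producing a list of column names works.
cols = [c for c in df.columns if c.startswith(("A", "B"))]

# The * unpacks the list, so calcul receives one positional argument per column.
df["result"] = df.apply(lambda x: calcul(*[x[c] for c in cols]), axis=1)

# Same idea, selecting the columns first and unpacking the row itself.
df["result_alt"] = df[cols].apply(lambda x: calcul(*x), axis=1)

print(df)
```

Without the *, calcul would receive a single argument (the list or generator itself) instead of one argument per column, which is why calcul(x[c] for c in cols) does not behave like the original call.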

Dan
  • Thanks, it works! I had tried df['result'] = df.apply(lambda x: calcul(x[c] for c in cols), axis=1), but that was wrong. – Kalhiren Feb 05 '20 at 07:53