Passing columns dynamically as arguments to a function in pandas with list comprehension

Question

I have the following dataframe:

p     s
ABCD  AB,AC,AD
XY    XY   
MSD   MS,MD
PQRS  PQ,PR,PS

I'm using the following syntax to split column s into columns s0, s1, s2, ...

df = df.join(df['s'].str.split(',', expand=True).add_prefix('s').fillna(np.nan))

which will result in

p     s         s0    s1    s2
ABCD  AB,AC,AD  AB    AC    AD 
XY    XY        XY    NaN   NaN
MSD   MS,MD     MS    MD    NaN
PQRS  PQ,PR,PS  PQ    PR    PS

Now I want to pass these newly generated column values into a function along with some other column values. For example:

def compare(p, s0, s1, s2):
    pass  # piece of code

Suppose the number of generated columns varies from dataset to dataset, depending on how many comma-separated fields column s contains (say 13 one time, i.e. s0, s1, ..., s12, and 15 another time, i.e. s0, s1, ..., s14). Is there a way to pass these column values to the function dynamically, based on the number of columns created?

Something like the following: def compare(p, [list comprehension])

Can I get any suggestions?

If you want a dynamic number of arguments in the function then `**kwargs` is an option. — Arpit Solanki, Jan 31 '18 at 13:24
You can pack arguments using a "starred" argument. `def compare(p, *cols):` — James, Jan 31 '18 at 13:25
My question is about how to pass large arguments in a function call and not to handle using *args or **kwargs — Avinash Clinton, Jan 31 '18 at 14:05
I'm already using *args to handle variable arguments in my function... But how can I pass multiple columns(s0,s1,s2,........,s35) as arguments to a function call in pandas — Avinash Clinton, Jan 31 '18 at 14:07

score 1 · Accepted Answer · answered Jan 31 '18 at 16:26

You could use the Index.difference method to generate a list of the new columns:

new_columns = df.columns.difference(old_columns).tolist()

For example,

import numpy as np
import pandas as pd

def compare(p, new_columns):
    print(new_columns)

df = pd.DataFrame({'p': ['ABCD', 'XY', 'MSD', 'PQRS'],
                   's': ['AB,AC,AD', 'XY', 'MS,MD', 'PQ,PR,PS']})

old_columns = df.columns
df = df.join(df['s'].str.split(',', expand=True).add_prefix('s').fillna(np.nan))
new_columns = df.columns.difference(old_columns).tolist()

compare(df['p'], new_columns)

prints

['s0', 's1', 's2']
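
To hand the column values themselves to the function, rather than only their names, one option is to unpack the selected columns with a star. The following is a minimal sketch, assuming compare accepts *cols as the asker's comments describe. The body of compare is hypothetical, since the question does not show it. Note that Index.difference returns its labels sorted, so with ten or more split columns 's10' would come before 's2'; the sketch filters df.columns instead to keep the original order.

```python
import pandas as pd

df = pd.DataFrame({'p': ['ABCD', 'XY', 'MSD', 'PQRS'],
                   's': ['AB,AC,AD', 'XY', 'MS,MD', 'PQ,PR,PS']})

old_columns = df.columns
df = df.join(df['s'].str.split(',', expand=True).add_prefix('s'))

# Keep the original left-to-right order; Index.difference would sort the labels.
new_columns = [c for c in df.columns if c not in old_columns]

def compare(p, *cols):
    # Hypothetical body: report how many split columns were received
    # and how many non-missing values each row has across them.
    filled = pd.concat(cols, axis=1).notna().sum(axis=1)
    return len(cols), filled.tolist()

# Unpack one Series per new column into the call.
n_cols, filled = compare(df['p'], *[df[c] for c in new_columns])
print(n_cols, filled)
```

The call site stays the same whether the split produces 3 columns or 35, because the list comprehension is built from whatever columns were added.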

Thanks. I was trying to find the max length of the split of column s and then tried a list comprehension, [s+str(i) for i in range(max)]. But it was passing strings (obviously)... — Avinash Clinton, Jan 31 '18 at 16:55
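
The list comprehension in the comment above builds column names, which are plain strings; indexing the dataframe with each name yields the column itself. A rough sketch, assuming the split columns follow the s0, s1, ... naming from the question. The names n_parts, names and columns, and the body of compare, are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'p': ['ABCD', 'XY', 'MSD', 'PQRS'],
                   's': ['AB,AC,AD', 'XY', 'MS,MD', 'PQ,PR,PS']})
parts = df['s'].str.split(',', expand=True).add_prefix('s')
df = df.join(parts)

# Number of split columns (n_parts is a name introduced for this sketch).
n_parts = parts.shape[1]

names = ['s' + str(i) for i in range(n_parts)]   # strings only
columns = [df[name] for name in names]           # the actual Series

def compare(p, *values):
    # Hypothetical row-wise body: keep the non-missing pieces of this row.
    return [v for v in values if pd.notna(v)]

# Row-wise call: each row's split values become positional arguments.
result = [compare(p, *vals)
          for p, *vals in df[['p'] + names].itertuples(index=False, name=None)]
print(result)
```

Either form works with any number of split columns: pass `*columns` for a function that operates on whole Series, or unpack per row as above for a function that expects scalar values.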
