
I need help modifying my function, and the way I apply it, so that the same if/else condition runs across multiple features.

Suppose we have the following table t1:

import pandas as pd
names = {'name': ['Jon','Bill','Maria','Emma']
         ,'feature1': [2,3,4,5]
         ,'feature2': [1,2,3,4]
         ,'feature3': [1,2,3,4]}
t1 = pd.DataFrame(names,columns=['name','feature1','feature2','feature3'])

I want to create 3 new columns based on an ifelse condition. Here is how I am doing it for the first feature:

# Define the conditions
def ifelsefunction(row):
    if row['feature1'] >=3:
        return 1
    elif row['feature1'] ==2:
        return 2
    else:
        return 0

# Apply the condition
t1['ft1'] = t1.apply(ifelsefunction, axis=1)

I would like to rewrite the function so that it can be reused for each feature, like this:

def ifelsefunction(row, feature):
    if row[feature] >=3:
        return 1
    elif row[feature] ==2:
        return 2
    else:
        return 0

t1['ft1_score'] = t1.apply(ifelsefunction(row, 'feature1'), axis=1)
t1['ft2_score'] = t1.apply(ifelsefunction(row, 'feature2'), axis=1)
t1['ft3_score'] = t1.apply(ifelsefunction(row, 'feature3'), axis=1)

---- EDIT ----

Thanks for the answers, I may have over-simplified the actual problem.

How do I do the same for these conditions?

def ifelsefunction(var1, var2):
    mask1 = (var1 >=3) and (var1<var2)
    mask2 = var1 == 2
    return np.select([mask1,mask2], [var1*0.7, var1*var2], default=0)
Gabriel
  • Does this answer your question? [python pandas: apply a function with arguments to a series](https://stackoverflow.com/questions/12182744/python-pandas-apply-a-function-with-arguments-to-a-series) – dimay Sep 21 '21 at 05:57

2 Answers


I think it is best to avoid loops here. Use numpy.select to test the conditions on only the columns selected from a list, and use DataFrame.pipe to pass that DataFrame into the function:

import numpy as np

# Define the conditions
def ifelsefunction(df):
    m1 = df >= 3
    m2 = df == 2
    return np.select([m1, m2], [1, 2], default=0)


cols = ['feature1','feature2','feature3']
t1[cols] = t1[cols].pipe(ifelsefunction)
#alternative
#t1[cols] = ifelsefunction(t1[cols])

print (t1)
    name  feature1  feature2  feature3
0    Jon         2         0         0
1   Bill         1         2         2
2  Maria         1         1         1
3   Emma         1         1         1

For new columns use:

# Define the conditions
def ifelsefunction(df):
    m1 = df >= 3
    m2 = df == 2
    return np.select([m1, m2], [1, 2], default=0)


cols = ['feature1','feature2','feature3']
new = [f'{x}_score' for x in cols]

t1[new] = t1[cols].pipe(ifelsefunction)
#alternative
#t1[new] = ifelsefunction(t1[cols])

print (t1)
    name  feature1  feature2  feature3  feature1_score  feature2_score  \
0    Jon         2         1         1               2               0   
1   Bill         3         2         2               1               2   
2  Maria         4         3         3               1               1   
3   Emma         5         4         4               1               1   

   feature3_score  
0               0  
1               2  
2               1  
3               1  

EDIT:

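You can change the function like this (note that `&` replaces `and`, because the comparison has to be element-wise on each Series):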

def ifelsefunction(df, var1, var2):
    mask1 = (df[var1] >=3) & (df[var1]<df[var2])
    mask2 = df[var1] == 2
    return np.select([mask1,mask2], [df[var1]*0.7, df[var1]*df[var2]], default=0)


t1['new'] = ifelsefunction(t1, 'feature3','feature1')
print (t1)
    name  feature1  feature2  feature3  new
0    Jon         2         1         1  0.0
1   Bill         3         2         2  6.0
2  Maria         4         3         3  2.1
3   Emma         5         4         4  2.8
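
If the same two-column rule has to run over several pairs of features, one possible approach is to loop over a mapping from new column name to column pair. This is only a minimal sketch: the dictionary `pairs` and the column names `f3_vs_f1` and `f1_vs_f2` are made up for illustration and do not come from the question.

```python
import numpy as np
import pandas as pd

t1 = pd.DataFrame({
    'name': ['Jon', 'Bill', 'Maria', 'Emma'],
    'feature1': [2, 3, 4, 5],
    'feature2': [1, 2, 3, 4],
    'feature3': [1, 2, 3, 4],
})

def ifelsefunction(df, var1, var2):
    # Element-wise masks: & is required here, `and` would raise on a Series
    mask1 = (df[var1] >= 3) & (df[var1] < df[var2])
    mask2 = df[var1] == 2
    return np.select([mask1, mask2],
                     [df[var1] * 0.7, df[var1] * df[var2]],
                     default=0)

# Hypothetical mapping: new column name -> (var1, var2)
pairs = {
    'f3_vs_f1': ('feature3', 'feature1'),
    'f1_vs_f2': ('feature1', 'feature2'),
}

for new_col, (var1, var2) in pairs.items():
    t1[new_col] = ifelsefunction(t1, var1, var2)

print(t1)
```

The loop only iterates over column pairs, so each call is still vectorized over the rows.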
jezrael
  • Thanks, this works well for my example. However, I might have made the mistake of oversimplifying my actual problem. The conditions need to rely on more than 1 feature, and the expected output is a computation of 2 features. - I've made an edit to the original question, please help if you can! – Gabriel Sep 21 '21 at 14:22
  • @Gabriel - Answer was edited. – jezrael Sep 22 '21 at 05:15

Try using apply on each column:

t1["ft1_score"] = t1.feature1.apply(lambda x: 1 if x >= 3 else (2  if x == 2 else 0))
t1["ft2_score"] = t1.feature2.apply(lambda x: 1 if x >= 3 else (2  if x == 2 else 0))
t1["ft3_score"] = t1.feature3.apply(lambda x: 1 if x >= 3 else (2  if x == 2 else 0))
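
To keep the row-wise function with a `feature` parameter from the question, the extra argument can be passed through `DataFrame.apply` with `args=` rather than by calling the function inside `apply`. The sketch below assumes the same `t1` as in the question; the loop and the `ft{i}_score` naming are just one way to arrange it. Row-wise `apply` is usually slower than a vectorized `numpy.select` on large frames.

```python
import pandas as pd

t1 = pd.DataFrame({
    'name': ['Jon', 'Bill', 'Maria', 'Emma'],
    'feature1': [2, 3, 4, 5],
    'feature2': [1, 2, 3, 4],
    'feature3': [1, 2, 3, 4],
})

def ifelsefunction(row, feature):
    # Same conditions as in the question, parametrized by column name
    if row[feature] >= 3:
        return 1
    elif row[feature] == 2:
        return 2
    else:
        return 0

# Pass the function itself; apply supplies each row and forwards args
for i, feature in enumerate(['feature1', 'feature2', 'feature3'], start=1):
    t1[f'ft{i}_score'] = t1.apply(ifelsefunction, axis=1, args=(feature,))

print(t1)
```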
Raymond Toh