30

Question

I am having trouble figuring out how to create new DataFrame column based on the values in two other columns. I need to use if/elif/else logic. But all of the documentation and examples I have found only show if/else logic. Here is a sample of what I am trying to do:

Code

df['combo'] = 'mobile' if (df['mobile'] == 'mobile') elif (df['tablet'] =='tablet') 'tablet' else 'other')

I am open to using where() also. Just having trouble finding the right syntax.

Will
  • 11,276
  • 9
  • 68
  • 76
DJElbow
  • 3,345
  • 11
  • 41
  • 52

5 Answers5

58

In cases where you have multiple branching statements it's best to create a function that accepts a row and then apply it along the axis=1. This is usually much faster then iteration through rows.

def func(row):
    if row['mobile'] == 'mobile':
        return 'mobile'
    elif row['tablet'] =='tablet':
        return 'tablet' 
    else:
        return 'other'

df['combo'] = df.apply(func, axis=1)
Viktor Kerkez
  • 45,070
  • 12
  • 104
  • 85
  • `This is usually much faster then iteration through rows.` -- doesn't this approach iterate through rows as well? – Dudelstein Apr 08 '23 at 08:29
47

I tried the following and the result was much faster. Hope it's helpful for others.

df['combo'] = 'other'
df.loc[df['mobile'] == 'mobile', 'combo'] = 'mobile'
df.loc[df['tablet'] == 'tablet', 'combo'] = 'tablet'
Eric O. Lebigot
  • 91,433
  • 48
  • 218
  • 260
tbk
  • 523
  • 4
  • 9
27

ELIF logic can be implemented with np.select or nested np.where:

import numpy as np

df['combo'] = np.select([df.mobile == 'mobile', df.tablet == 'tablet'], 
                        ['mobile', 'tablet'], 
                        default='other')
# or 
df['combo'] = np.where(df.mobile == 'mobile', 'mobile', 
                       np.where(df.tablet == 'tablet', 'tablet', 'other'))

Sample Data + Output:

   mobile  tablet   combo
0  mobile     bar  mobile
1     foo  tablet  tablet
2     foo     nan   other
3  mobile  tablet  mobile
4  mobile     nan  mobile
5     foo  tablet  tablet
6  mobile     bar  mobile
7  mobile  tablet  mobile
8  mobile     bar  mobile
9  mobile     nan  mobile
ALollz
  • 57,915
  • 7
  • 66
  • 89
  • 5
    this really should be the accepted solution; it's scalable, and it doesn't even require an additional import: `pd.np.select`. – circld Mar 22 '19 at 23:33
  • ageed - I can't get any of these calling functions to work. – GenDemo Jun 24 '21 at 00:54
  • how would you do an 'in' statement: example `df['combo'] = np.where(df.device in ('moblile, 'Mobile', 'Phone'), 'mobile' , np.where(df.device like '%tablet%', 'tablet', 'other'))` – GenDemo Jun 24 '21 at 00:56
  • 1
    @GenDemo you can use `isin` for checking if something is in a list: https://stackoverflow.com/questions/19960077/how-to-filter-pandas-dataframe-using-in-and-not-in-like-in-sql. IIU the other logic, for the like you can use `Series.str.contains('tablet', case=False)` – ALollz Jun 24 '21 at 01:06
  • @ALollz thank you. I came right with `sale_method = pd.DataFrame(model_data['Sale Method'].str.upper()) sale_method['sale_classification'] = np.where(sale_method['Sale Method'].isin (['PRIVATE']), 'private' ,np.where(sale_method['Sale Method'].str.contains('AUCTION'), 'auction' , 'other'))` – GenDemo Jun 24 '21 at 01:41
3

Adding to np.where solution :

df['col1']= np.where(df['col'] < 3, 1,np.where( (df['col'] >3 )& (df['col'] <5),2,3))

Overall Logic is :

np.where(Condition, 'true block','false block'). 

With each true/false block can in turn again be nested.

Also, Notice the & for ANDing! (not 'and')

Sfili_81
  • 2,377
  • 8
  • 27
  • 36
0

Adding to apply lambda solution :

 df['combo'] = df.apply(lambda x: 'mobile' if x['mobile'] == 'mobile' else \
                                 ('tablet' if x[''mobile']== 'tablet' else 'other'), axis=1)

Structure :

df['column_name'] = df.apply(lambda x: 'value if true 1' if x['column_name_check_1'] == 'condition_1' else 
                                      ('value if true 2' if x['column_name_check_2'] == 'condition_2' else 
                                      ('value if true 3' if x['column_name_check_3'] == 'condition_3' else 'default_value')),axis=1)

Note : axis 1 for format value as column / series

Fariliana Eri
  • 181
  • 2
  • 5