python dataframe column apply a function

Question

I have a dataframe:

import pandas as pd
data = {'A': ['SA01', '0007', 'SA06', '0198', 'SA06'], 
        'B': [2012, 2012, 2013, 2014, 2014], }
df = pd.DataFrame(data)

df = A     B
     SA01  2012
     0007  2012
     SA06  2013
     0198  2014
     SA06  2014

I want to use df.apply, or another pandas function, to add a column df['C'] as follows:

df = A     B     C
     SA01  2012  M
     0007  2012  F
     SA06  2013  M
     0198  2014  F
     SA06  2014  M

If df['A'] contains the substring 'SA', then df['C'] should be 'M'; otherwise it should be 'F'. How can I do this?

score 2 · Accepted Answer · answered Sep 12 '18 at 12:33

Use numpy.where with a boolean mask created by str.contains or str.startswith:

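import numpy as np

# 'M' where column A contains 'SA', otherwise 'F'
df['new'] = np.where(df['A'].str.contains('SA'), 'M', 'F')
# alternative: match only at the start of the string
# df['new'] = np.where(df['A'].str.startswith('SA'), 'M', 'F')
print(df)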
      A     B new
0  SA01  2012   M
1  0007  2012   F
2  SA06  2013   M
3  0198  2014   F
4  SA06  2014   M
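
Since the question asks about df.apply specifically, here is a minimal self-contained sketch that puts the mask approach next to an element-wise Series.apply version. The column names 'C' and 'C_apply' and the regex=False / na=False arguments are my own choices for illustration, not part of the answer above; na=False only matters if column A could contain missing values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': ['SA01', '0007', 'SA06', '0198', 'SA06'],
    'B': [2012, 2012, 2013, 2014, 2014],
})

# Vectorized: build a boolean mask, then pick 'M' where True and 'F' elsewhere.
# regex=False does a plain substring test; na=False treats missing values as no match.
mask = df['A'].str.contains('SA', regex=False, na=False)
df['C'] = np.where(mask, 'M', 'F')

# Element-wise alternative with Series.apply (assumes every value in A is a string).
df['C_apply'] = df['A'].apply(lambda s: 'M' if 'SA' in s else 'F')

print(df)
```

Both columns come out the same for this data. The mask version is usually the better fit on large frames because it avoids calling a Python function per row, while apply is easier to extend if the rule becomes more involved.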
