pandas - add column based on conditions

Question

Starting from a simple dataframe df like:

C,n
AAA,1
AAA,2
BBB,1
BBB,2
CCC,1
CCC,2
DDD,1
DDD,2

I would like to add a column based on some conditions on values in the C column. The column I would like to add is:

df['H'] = df['n'] / 10

which returns:

     C  n    H
0  AAA  1  0.1
1  AAA  2  0.2
2  BBB  1  0.1
3  BBB  2  0.2
4  CCC  1  0.1
5  CCC  2  0.2
6  DDD  1  0.1
7  DDD  2  0.2

Now I would like to compute the same column, but with a different normalization factor only for the rows where C is CCC or DDD, for instance:

df['H'] = df['n'] / 100

so that:

     C  n    H
0  AAA  1  0.1
1  AAA  2  0.2
2  BBB  1  0.1
3  BBB  2  0.2
4  CCC  1  0.01
5  CCC  2  0.02
6  DDD  1  0.01
7  DDD  2  0.02

So far I tried to mask the dataframe as:

mask = df['C'] == 'CCC'
df = df[mask]
df['H'] = df['n'] / 100

That works on the masked sample, but since I have to apply several filters while keeping the original H values for the unfiltered rows, I'm getting confused.

score 3 · Accepted Answer · answered Nov 03 '15 at 14:48

df.loc[df['C'] == 'CCC' , 'H'] = df['n'] / 100
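
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's dataframe, fills H with the default factor, and then overrides only the matching rows. The line above selects CCC only; widening the condition with `|` so that DDD is included is an assumption about what the question asks for, and the name `special` is just illustrative.

```python
import pandas as pd

# Rebuild the sample dataframe from the question.
df = pd.DataFrame({
    "C": ["AAA", "AAA", "BBB", "BBB", "CCC", "CCC", "DDD", "DDD"],
    "n": [1, 2, 1, 2, 1, 2, 1, 2],
})

# Default normalization for every row.
df["H"] = df["n"] / 10

# Boolean mask for the rows that need the other factor (illustrative name).
special = (df["C"] == "CCC") | (df["C"] == "DDD")

# Assign through .loc: the right-hand side is aligned on the index,
# so only the rows selected by the mask are overwritten.
df.loc[special, "H"] = df["n"] / 100
```

Because the assignment goes through `.loc` on the original dataframe, no rows are dropped and the rows outside the mask keep their earlier H values.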

answered Nov 03 '15 at 14:48

Nader Hisham

score 2 · Answer 2 · answered Nov 03 '15 at 14:39

You can also use loc together with isin (the older .ix indexer was removed in pandas 1.0):

df.loc[df['C'].isin(['CCC', 'DDD']), 'H'] = df['n'] / 100
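
Since the question mentions several filters, one possible variation is to look the divisor up per row instead of writing one masked assignment per group. This is only a sketch: the `divisors` dictionary and the fallback of 10 are assumptions taken from the numbers in the question, not something the answer above prescribes.

```python
import pandas as pd

# Rebuild the sample dataframe from the question.
df = pd.DataFrame({
    "C": ["AAA", "AAA", "BBB", "BBB", "CCC", "CCC", "DDD", "DDD"],
    "n": [1, 2, 1, 2, 1, 2, 1, 2],
})

# Hypothetical lookup table: divisor for each value of C that needs
# special treatment. Values of C not listed here map to NaN.
divisors = {"CCC": 100, "DDD": 100}

# Replace the NaN entries with the default divisor of 10.
factor = df["C"].map(divisors).fillna(10)

# One vectorized division covers every group at once.
df["H"] = df["n"] / factor
```

Adding another filter then only means adding another entry to the dictionary.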

answered Nov 03 '15 at 14:39

steboc

score 1 · Answer 3 · edited May 23 '17 at 12:07

Using the examples in this answer you can use:

df.loc[mask, 'H'] = df.loc[mask, 'n'] / 100

You could also compute the H column separately for the rows where C is 'CCC' or 'DDD' and for all the other rows:

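import numpy as np
# Rows that need the alternative normalization factor
mask = np.logical_or(df['C'] == 'CCC', df['C'] == 'DDD')
not_mask = np.logical_not(mask)
# Assign through .loc to avoid chained indexing, and divide n rather than H
df.loc[not_mask, 'H'] = df.loc[not_mask, 'n'] / 10
df.loc[mask, 'H'] = df.loc[mask, 'n'] / 100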
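
The two-branch calculation can also be written as a single expression with `numpy.where`, which picks one value where the mask is true and another where it is false. This is a sketch of an alternative rather than the approach used above, and it assumes the same two factors (10 and 100) as the question.

```python
import numpy as np
import pandas as pd

# Rebuild the sample dataframe from the question.
df = pd.DataFrame({
    "C": ["AAA", "AAA", "BBB", "BBB", "CCC", "CCC", "DDD", "DDD"],
    "n": [1, 2, 1, 2, 1, 2, 1, 2],
})

# True for the rows that need the alternative factor.
mask = df["C"].isin(["CCC", "DDD"])

# Where mask is True take n / 100, otherwise take n / 10.
df["H"] = np.where(mask, df["n"] / 100, df["n"] / 10)
```

For more than two cases, `numpy.select` takes a list of conditions and a matching list of choices in the same spirit.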

edited May 23 '17 at 12:07

Community

answered Nov 03 '15 at 14:38

agold
