
I would like to look at pairs of rows that share the same identification number, compare the number of children recorded for each person, and assign the larger number to both rows. I was thinking of grouping by ID number (.groupby), but I am not sure where to go from there. Specifically, I am not sure how to find which NumChil value is larger while also replacing the smaller value with the larger one. For example:

 Index   ID             NumChil  
 0       2011000070          3   
 1       2011000070          0   
 2       2011000074          0 
 3       2011000074          1   

should turn into:

 Index   ID             NumChil  
 0       2011000070          3   
 1       2011000070          3   
 2       2011000074          1 
 3       2011000074          1  
stav

1 Answer


Preferred Option
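Use groupby with transform and max. transform('max') computes the maximum within each ID group and returns it once per original row, so the result lines up with the DataFrame's index.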

df.groupby('ID').NumChil.transform('max')

0    3
1    3
2    1
3    1
Name: NumChil, dtype: int64

You can assign the result back to the existing column with

df['NumChil'] = df.groupby('ID').NumChil.transform('max')
df

   Index          ID  NumChil
0      0  2011000070        3
1      1  2011000070        3
2      2  2011000074        1
3      3  2011000074        1

Or produce a copy with

df.assign(NumChil=df.groupby('ID').NumChil.transform('max'))

   Index          ID  NumChil
0      0  2011000070        3
1      1  2011000070        3
2      2  2011000074        1
3      3  2011000074        1
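
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's sample data and applies the transform approach. The variable names `group_max` and `result` are my own, and I have left out the `Index` column on the assumption that the default integer index plays that role.

```python
import pandas as pd

# Sample data from the question (the 'Index' column is omitted here;
# the default integer index stands in for it).
df = pd.DataFrame({
    "ID": [2011000070, 2011000070, 2011000074, 2011000074],
    "NumChil": [3, 0, 0, 1],
})

# transform('max') returns one value per original row, aligned on df's index,
# so every row receives the maximum of its own ID group.
group_max = df.groupby("ID")["NumChil"].transform("max")

# assign() returns a new DataFrame and leaves df itself unchanged.
result = df.assign(NumChil=group_max)

print(result)
```

Bracket access (`["NumChil"]`) is used instead of attribute access only because it also works for column names that are not valid Python identifiers; the behavior is otherwise the same.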

Alternative Approaches

groupby with max and map

df.ID.map(df.groupby('ID').NumChil.max())

0    3
1    3
2    1
3    1
Name: ID, dtype: int64

df.assign(NumChil=df.ID.map(df.groupby('ID').NumChil.max()))

   Index          ID  NumChil
0      0  2011000070        3
1      1  2011000070        3
2      2  2011000074        1
3      3  2011000074        1
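
The map approach can be broken into its two steps, which may make it easier to see why it works. This is only a sketch on the question's sample data; `per_id_max` and `mapped` are names I introduced for illustration.

```python
import pandas as pd

# Sample data from the question (default integer index, no 'Index' column).
df = pd.DataFrame({
    "ID": [2011000070, 2011000070, 2011000074, 2011000074],
    "NumChil": [3, 0, 0, 1],
})

# Step 1: one maximum per ID. The result is a Series indexed by ID,
# so it acts as a lookup table.
per_id_max = df.groupby("ID")["NumChil"].max()

# Step 2: look up each row's ID in that table. The mapped Series keeps
# the name 'ID', which is why the output above is labelled 'ID'.
mapped = df["ID"].map(per_id_max)

# Passing it to assign() stores the values under the 'NumChil' column name.
result = df.assign(NumChil=mapped)

print(result)
```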

groupby with max and join

df.drop(columns='NumChil').join(df.groupby('ID').NumChil.max(), on='ID')

   Index          ID  NumChil
0      0  2011000070        3
1      1  2011000070        3
2      2  2011000074        1
3      3  2011000074        1
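
As a rough check that the join approach gives the same values as transform, here is a small sketch on the question's sample data. The names `per_id_max`, `joined`, and `via_transform` are hypothetical, and the column is dropped with the `columns=` keyword because I am assuming a recent pandas version where `drop` no longer accepts the axis as a second positional argument.

```python
import pandas as pd

# Sample data from the question (default integer index, no 'Index' column).
df = pd.DataFrame({
    "ID": [2011000070, 2011000070, 2011000074, 2011000074],
    "NumChil": [3, 0, 0, 1],
})

# One maximum per ID, as a Series named 'NumChil' and indexed by ID.
per_id_max = df.groupby("ID")["NumChil"].max()

# Drop the old column, then join the per-ID maxima back on the 'ID' column.
# join() matches df's 'ID' values against the index of per_id_max.
joined = df.drop(columns="NumChil").join(per_id_max, on="ID")

# The preferred approach, for comparison.
via_transform = df.assign(NumChil=df.groupby("ID")["NumChil"].transform("max"))

# Compare the resulting column values row by row.
print(joined["NumChil"].tolist() == via_transform["NumChil"].tolist())
```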
piRSquared