1

I have a dataframe that looks like the following. The rightmost column is my desired column:

Group   Value   Target_Column   
1        0         0     
1        0         0  
1        1         1
1        2         0
2        0         0
2        1         1
2        0         0
2        1         0

How do I identify the first non-zero value in a group(Group) and then create a column that retains the first non-zero value and show all else as zeroes?

I have been trying to leverage idxmax for this as stated in this solution: Find first non-zero value in each column of pandas DataFrame

import pandas as pd
df = pd.DataFrame({'Group': [1,1,1,1,2,2,2,2], 'Value': [0,0,1,1,0,1,0,1]})
df.ne(0).idxmax()
g = df.groupby('Group').Value
g.ne(0).idxmax()
jpp
  • 159,742
  • 34
  • 281
  • 339
gibbz00
  • 1,947
  • 1
  • 19
  • 31

1 Answers1

3

Using idxmax

df['Newcol']=0
df.loc[df.Value.ne(0).groupby(df['Group']).idxmax(),'Newcol']=1
df
Out[41]: 
   Group  Value  Target_Column  Newcol
0      1      0              0       0
1      1      0              0       0
2      1      1              1       1
3      1      2              0       0
4      2      0              0       0
5      2      1              1       1
6      2      0              0       0
7      2      1              0       0
BENY
  • 317,841
  • 20
  • 164
  • 234
  • Thank you for the answer. I am noticing that for groups that have no non-zero values(all of them are 0s in `Value` column), `NewCol` shows a 1 for the first element. How do I show zeroes for all elements if there are no non-zero element? – gibbz00 Jul 29 '18 at 20:53
  • 1
    @gibbz00 doing this for protection `df.loc[df.Newcol.eq(0),'Newcol']=0 ` at the end – BENY Jul 29 '18 at 20:58
  • Thanks. I think you meant `df.loc[df.Value.eq(0),'Newcol']=0 ` – gibbz00 Jul 29 '18 at 21:08