I have two data frames, main and auxiliary, and I am concatenating the auxiliary one to the main one. This leaves NaN in some rows of the aux column. I want to fill some of those NaN values, but not all of them. Code:

import pandas as pd

df1 = pd.DataFrame({'Main': [0, 10, 20, 30, 40, 50, 60, 70, 80]})
# df1:
#    Main
# 0     0
# 1    10
# 2    20
# 3    30
# 4    40
# 5    50
# 6    60
# 7    70
# 8    80
df2 = pd.DataFrame({'aux': ['aa', 'aa', 'bb', 'bb']}, index=[0, 2, 5, 7])
# df2:
#   aux
# 0  aa
# 2  aa
# 5  bb
# 7  bb
df = pd.concat([df1, df2], axis=1)
# After concatenating, I want to fill the NaN rows in the aux column only where
# they lie between two rows with the same value. For example: fill the rows between
# 0 and 2 with 'aa', leave the rows between 2 and 5 as NaN, and fill the rows between 5 and 7 with 'bb'.
df = pd.concat([df1, df2], axis=1).ffill()
print(df)

Present result:

  Main aux
0    0   aa
1   10   aa
2   20   aa
3   30   aa # Wrong, here it should be NaN
4   40   aa # Wrong, here it should be NaN
5   50   bb 
6   60   bb
7   70   bb
8   80   bb # Wrong, here it should be NaN

Expected result:

  Main aux
0    0   aa
1   10   aa
2   20   aa
3   30  NaN
4   40  NaN
5   50   bb
6   60   bb
7   70   bb
8   80  NaN
Håken Lid
Mainland

1 Answer

If I understand correctly, you want to fill a NaN only where a forward fill and a backward fill give the same value. That can be done like this:

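ff = df['aux'].ffill()          # nearest non-null value above each row
bf = df['aux'].bfill()          # nearest non-null value below each row
df['aux'] = ff.where(ff == bf)  # keep the fill only where both agree, NaN elsewhere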
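
For reference, here is a self-contained sketch of the same idea applied to the data from the question. The helper name `fill_between_matching` is only an illustrative choice, not part of pandas, and the sketch assumes the column holds plain string labels as in the example.

```python
import pandas as pd


def fill_between_matching(s: pd.Series) -> pd.Series:
    """Fill NaN gaps only where the nearest non-null values above and below agree.

    Hypothetical helper name, used for illustration only.
    """
    ff = s.ffill()  # value carried down from above
    bf = s.bfill()  # value carried up from below
    # A comparison involving NaN is False, so leading/trailing gaps and
    # gaps between different labels stay NaN.
    return ff.where(ff == bf)


df1 = pd.DataFrame({'Main': [0, 10, 20, 30, 40, 50, 60, 70, 80]})
df2 = pd.DataFrame({'aux': ['aa', 'aa', 'bb', 'bb']}, index=[0, 2, 5, 7])

df = pd.concat([df1, df2], axis=1)
df['aux'] = fill_between_matching(df['aux'])
print(df)
```

Rows 3, 4 and 8 stay NaN because the forward-filled and back-filled values differ there (or one of them is missing), which appears to match the expected result in the question.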
Håken Lid