3

I have a Excel file with a index that is merged over several rows in Excel, and when I load it in pandas, it reads the first row as the index label, and the rest (the merged cells) is filled with NaNs. How can I loop over the index so that it fills the NaNs with the corresponding index?

EDIT: Image of excel removed by request. I don't have any specific code, but I can write an example.

import pandas as pd
df = pd.read_excel('myexcelfile.xlsx', header=1)
df.head()
                     Index-header               Month
0                          Index1                   1   
1                           NaN                     2    
2                           NaN                     3    
3                           NaN                     4     
4                           NaN                     5
5                           Index2                  1
6                           NaN                     2
...
joddm
  • 589
  • 1
  • 6
  • 18
  • 1
    Please don't put images here. Read [how to make reproducible pandas examples](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) and put some clipboard friendly code here. Also share the code you use to read this. – Ivan Oct 24 '16 at 14:47

2 Answers2

4

Try this:

In [205]: df
Out[205]:
    Index-header  Month
0         Index1    1.0
1            NaN    2.0
2            NaN    3.0
3            NaN    4.0
4            NaN    5.0
5         Index2    1.0
6            NaN    2.0
...          NaN    NaN

In [206]: df['Index-header'] = df['Index-header'].fillna(method='pad')

In [207]: df
Out[207]:
    Index-header  Month
0         Index1    1.0
1         Index1    2.0
2         Index1    3.0
3         Index1    4.0
4         Index1    5.0
5         Index2    1.0
6         Index2    2.0
...       Index2    NaN
MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
2
from StringIO import StringIO
import pandas as pd

txt = """Index1,1
,2
,3
Index2,1
,2
,3"""

df = pd.read_csv(StringIO(txt), header=None, index_col=0, names=['Month'])
df

enter image description here

df.set_index(df.index.to_series().ffill(), inplace=True)
df

enter image description here

piRSquared
  • 285,575
  • 57
  • 475
  • 624