Get length of dataframe as it appears in Excel (eg. including any column levels)

Question

EDIT: Essentially I need a method compatible with single level column indexes and MultiIndex columns which will return the integer number of levels in the columns.

I am writing multiple DataFrames out to Excel on a single sheet, and would like to move the start_row on to avoid overwriting any data.

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.randn(10, 4), columns=[list('ABCD'), list('EFGH')])
print(len(df))

len(df) will return 10; how can I also include the column header levels, so that the return value is 12?

len(df.columns.levels) will only work if the columns are a multi-index. This applies for some of my dataframes but not all. They are a mix of single level and multiindex.

Is a try except block the best approach (catching AttributeError for the single level dataframes), or is there a more elegant way?

How about using something like `length = len(df) + len(df.columns.levshape)`? — screenpaver, Jun 29 '18 at 11:30
If I haven't misunderstood, this might solve your problem: [https://stackoverflow.com/questions/38074678/append-existing-excel-sheet-with-new-dataframe-using-python-pandas](https://stackoverflow.com/questions/38074678/append-existing-excel-sheet-with-new-dataframe-using-python-pandas) — Mateo Rod, Jun 29 '18 at 11:32
@screenpaver .levshape is new to me but raises AttributeError for non-MultiIndex columns. — ac24, Jun 29 '18 at 12:08
@MateoRod my concern is knowing the correct startrow after I have written out the dataframe. This will depend on the number of column levels. Essentially I need a method compatible with single level column indexes and MultiIndex columns which will return the integer number of levels in the columns. — ac24, Jun 29 '18 at 12:12

score 1 · Accepted Answer · answered Jun 29 '18 at 16:39

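The documentation of MultiIndex mentions an attribute, nlevels, that you could use. It is also defined on a plain Index, where it returns 1, so the same call works for both kinds of columns.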

import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(10, 4), columns=[list('ABCD'), list('EFGH')])
print(df1.columns)
print(df1.columns.nlevels)
df2 = pd.DataFrame(np.random.randn(10, 4), columns=['A','B','C','D'])
print(df2.columns)
print(df2.columns.nlevels)

gives

MultiIndex(levels=[['A', 'B', 'C', 'D'], ['E', 'F', 'G', 'H']],
       labels=[[0, 1, 2, 3], [0, 1, 2, 3]])
2
Index(['A', 'B', 'C', 'D'], dtype='object')
1

which should answer your question concerning the number of rows to move.
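To connect this back to the start row problem, here is a minimal sketch of how nlevels could be used to compute where each DataFrame should begin on the sheet. The helper names `excel_height` and `start_rows`, and the `gap` parameter, are made up for illustration and are not part of pandas. The height follows the question's definition (data rows plus one row per column level).

```python
import numpy as np
import pandas as pd


def excel_height(df: pd.DataFrame) -> int:
    """Hypothetical helper: data rows plus one header row per column level."""
    # nlevels is 1 for a plain Index and the number of levels for a MultiIndex.
    return len(df) + df.columns.nlevels


def start_rows(frames, gap=1):
    """Hypothetical helper: yield a start row for each frame, top to bottom."""
    row = 0
    for df in frames:
        yield row
        # Move past this frame's header and data rows, plus `gap` spare rows.
        row += excel_height(df) + gap


df1 = pd.DataFrame(np.random.randn(10, 4), columns=[list("ABCD"), list("EFGH")])
df2 = pd.DataFrame(np.random.randn(10, 4), columns=list("ABCD"))

rows = list(start_rows([df1, df2]))
print(excel_height(df1))  # 12
print(excel_height(df2))  # 11
print(rows)               # [0, 13]
```

Each value could then be passed as the startrow argument of DataFrame.to_excel. One caveat: when the index is written, to_excel may add an extra row beneath MultiIndex headers, so a gap of at least 1 seems a safe default. It is worth checking the actual output with your pandas version.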
