9

I want to convert my df to an excel sheet, but also want to add a header column to categorize all the columns. Here is a screenshot without the merged column header

and the second picture shows the 'Financials' and 'Obs' headings

For reproduction:

import pandas as pd
 # Create a Pandas dataframe from some data.
df = pd.DataFrame({'Data': [10, 20, 30, 20, 15, 30, 45]})

# Create a Pandas Excel writer using XlsxWriter as the engine.
writer = pd.ExcelWriter('pandas_simple.xlsx', engine='xlsxwriter')

# Convert the dataframe to an XlsxWriter Excel object.
df.to_excel(writer, sheet_name='Sheet1')

# Close the Pandas Excel writer and output the Excel file.
writer.save()
Danish Zahid Malik
  • 541
  • 2
  • 7
  • 19

1 Answers1

9

You can create MultiIndex:

df = pd.DataFrame({
        'A':list('abcdef'),
         'B':[4,5,4,5,5,4],
         'C':[7,8,9,4,2,3],
         'D':[1,3,5,7,1,0],
         'E':[5,3,6,9,2,4],
         'F':list('aaabbb')
})

Specified new name of level with start and end column name:

L = [('OBS','A','C'), ('FIN', 'D','F')]

And then in list comprehension create tuples for MultiIndex.from_tuples:

cols = [(new, c) for new, start, end in L for c in df.loc[:, start:end].columns]

print (cols)
[('OBS', 'A'), ('OBS', 'B'), ('OBS', 'C'), ('FIN', 'D'), ('FIN', 'E'), ('FIN', 'F')]

df.columns = pd.MultiIndex.from_tuples(cols)
print (df)
  OBS       FIN      
    A  B  C   D  E  F
0   a  4  7   1  5  a
1   b  5  8   3  3  a
2   c  4  9   5  6  a
3   d  5  4   7  9  b
4   e  5  2   1  2  b
5   f  4  3   0  4  b
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • what if I want only the 'OBS' and 'FIN' not the others? – Leon Rai Feb 28 '19 at 23:58
  • @LeonRai - Not sure if understand, can you explain more? – jezrael Mar 01 '19 at 06:10
  • above example provides 2 merged labels (OBS and FIN) and the others in the row below. What if I need only the OBS and FIN only? I tried with cols = [('OBS', ''), ('OBS', ''), ('OBS', ''), ('FIN', ''), ('FIN', ''), ('FIN', '')] but it creates an extra empty row below itself. – Leon Rai Mar 03 '19 at 16:45
  • 1
    @LeonRai - duplicated columns are not recomended, because problem with select. If use `df['OBS']` it return all columns with `OBS`. But it is possiblem just assign `df.columns = ['OBS','OBS','OBS','FIN','FIN','FIN']`. Need same values in list like values in columns. – jezrael Mar 03 '19 at 16:50
  • I tried that too, it creates multiple rows each letter in a row, weird. – Leon Rai Mar 03 '19 at 16:55
  • @LeonRai - Not sure if understand. – jezrael Mar 03 '19 at 16:57
  • F in the 1st row, I in 2nd, N in 3rd and so on – Leon Rai Mar 03 '19 at 17:00
  • 1
    @LeonRai - hmmm, not sure. There is possible remove second level with empty strings by `df.columns = df.columns.droplevel(1)` – jezrael Mar 03 '19 at 17:03
  • I tried that too, interestingly when removing the below rows, it resets the merge. Well, thank you, as a workaround the excess row can be eliminated by using python-excel (openxlsx, .. ) libraries. Seems like it's not possible with pandas. – Leon Rai Mar 03 '19 at 17:14
  • Suppose I require 'OBS' to be a `field variable`, where it can take different `strings` values depending on the `dataframe's` content? – Gathide Sep 24 '21 at 10:20
  • @Gathide - Waht means `field variable` ? Can you post new question - input data, expected ouput and your code? – jezrael Sep 24 '21 at 10:22
  • Have asked the question, see https://stackoverflow.com/questions/69324898/pandas-dataframe-with-dynamic-title – Gathide Sep 25 '21 at 09:37