9

I'm using xlsxwriter and the set_column function that format the columns in my excel outputs.

However, formatting seems to be ignored when applied to the index column (or index columns in case of multi index).

I've found a workaround, so far is to introduce a fake index with reset_index then pass index=False to the to_excel function but then the nice merging feature of the multi index will be gone too.

Any ideas?

import pandas as pd
import numpy as np

from Config import TEMP_XL_FILE

def temp():
    ' temp'
    pdf = pd.DataFrame(np.random.randn(6,4), columns=list('ABCD'))
    pdf.set_index('A', drop=True, inplace=True)
    writer = pd.ExcelWriter(TEMP_XL_FILE, engine='xlsxwriter')
    pdf.to_excel(writer, 'temp')
    workbook = writer.book
    worksheet = writer.sheets['temp']
    tempformat = workbook.add_format({'num_format': '0%', 'align': 'center'})
    worksheet.set_column(-1, 3, None, tempformat)
    writer.save()

if __name__ == '__main__':
    temp()
A. Kerbrat
  • 91
  • 1
  • 3
  • 1
    Can you provide us your code? – Giordano Oct 06 '16 at 09:50
  • 1
    In XlsxWriter, and in Excel, a Cell format overrides a Row format overrides a Column format. In this case Pandas is applying a cell format (with merge) to the index cells so the `set_column()` format doesn't have an effect. I don't think it is possible to override or set the Panda's index format via the API (apart from the `datetime_format` and `date_format`), – jmcnamara Oct 06 '16 at 12:46

4 Answers4

3

I couldn't get @Max's answer to work, but the following worked for me with 1.1.5 (should work for pandas 1.*):

import pandas.io.formats.excel

pandas.io.formats.excel.ExcelFormatter.header_style = None
xjcl
  • 12,848
  • 6
  • 67
  • 89
2

The pandas ExcelWriter overwrites the XlsxWriter formats in the index columns. To prevent that, change the pandas header_style to None

header_style = {"font": {"bold": True},
                "borders": {"top": "thin",
                            "right": "thin",
                            "bottom": "thin",
                            "left": "thin"},
                "alignment": {"horizontal": "center",
                              "vertical": "top"}} 

To do that:

import pandas.io.formats.excel

pandas.io.formats.excel.header_style = None

See also

Max
  • 1,685
  • 16
  • 21
1

As far as I've understood, Pandas sets the format of the index row. There are ways to reset it, but those solutions weren't very reliable. It was also quite hard to actually format it.

Writing out the index columns with the desired format worked best for me:

import pandas as pd

# The data that we're feeding to ExcelWriter
df = pd.DataFrame(
    {
        "Col A": ["a", "a", "b", "b"],
        "Col B": ["a", "b", "c", "d"],
        "Col C": [1, 2, 3, 4],
    }
)

# The Excel file we're creating
writer = pd.ExcelWriter("pandas_out.xlsx", engine="xlsxwriter")
df.to_excel(writer, sheet_name="Sheet1", index=False) # Prevents Pandas from outputting an index

# The variables we'll use to do our modifications
workbook = writer.book
worksheet = writer.sheets["Sheet1"]

worksheet.set_row(0, 30) # Set index row height to 30

# Find more info here: https://xlsxwriter.readthedocs.io/format.html#format-methods-and-format-properties
header_format = workbook.add_format(
    {
        "bold": True,
        "valign": "vcenter",
        "align": "center",
        "bg_color": "#d6d6d6",
        "border": True,
    }
)

# Write the column headers with the defined format.
for col_num, value in enumerate(df.columns.values):
    worksheet.write(0, col_num, value, header_format)

# Set format of data
format1 = workbook.add_format({"align": "center"})
worksheet.set_column('A:Z', 10, format1) # Width of cell

writer.save()
Coco
  • 826
  • 1
  • 12
  • 20
  • 1
    You're not just writing the columns, you're _overwriting_ them from what Pandas has already set. This is not ideal. You can instead define your own formatter and keep Pandas' own default column loop. – Reinderien Jul 20 '21 at 03:25
0
  • Don't monkey-patch the library, derive from the formatter class
  • Don't bother rewriting all of the code to create the cells yourself
  • Use the intermediate "CSS-like" formatting language Pandas defines internally for cell content; this is all written out in pandas/io/excel/_xlsxwriter.py

This works with 1.1.5:

import numpy as np
import pandas as pd
from pandas.io.formats.excel import ExcelFormatter
from typing import Dict, Any

# from Config import TEMP_XL_FILE
TEMP_XL_FILE = 'headers.xlsx'

class CenteredFormatter(ExcelFormatter):
    @property
    def header_style(self) -> Dict[str, Any]:
        d = dict(super().header_style)
        d.setdefault('alignment', {})['horizontal'] = 'center'
        d.setdefault('number_format', {})['format_code'] = '0%'
        return d


def temp() -> None:
    with pd.ExcelWriter(TEMP_XL_FILE, engine='xlsxwriter') as writer:
        pdf = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'))
        pdf.set_index('A', drop=True, inplace=True)
        formatter = CenteredFormatter(pdf)
        formatter.write(writer, sheet_name='temp')

if __name__ == '__main__':
    temp()
Reinderien
  • 11,755
  • 5
  • 49
  • 77