0

I have a pandas dataframe like below

df1 = pd.DataFrame({'sfassaasfsa_id': [1321352,2211252],'SUB':['Phy','Phy'],'Revasfasf_Q1':[8215215210,1221500],'Revsaffsa_Q2':[6215120,525121215125120],'Reaasfasv_Q3':[20,12],'Revfsasf_Q4':[1215120,11221252]})

I am trying to do the below

a) filter the dataframe based on unique values of SUB

b) store them in multiple files

column_name = "SUB"
col_name = "Reaasfasv_Q3"
for i,j in dict.fromkeys(zip(df1[column_name], df1[col_name])).keys():
    data_output = df1.query(f"{column_name} == @i & {col_name} == @j")
    if len(data_output) > 0:
        output_path = Path.cwd() / f"{i}_{j}_output.xlsx"
        print("output path is ", output_path)
        writer = pd.ExcelWriter(output_path, engine='xlsxwriter')
        data_output.to_excel(writer,sheet_name='results',index=False)
        for column in data_output:
            column_length = max(data_output[column].astype(str).map(len).max(), len(column))
            col_idx = data_output.columns.get_loc(column)
            writer.sheets['results'].set_column(col_idx, col_idx, column_length)
        writer.save()

while the code works fine, however the problem is in adjusting the column_length

How can I adjust the column_length to be fit based on content type.

I expect my output to have a excel files with all columns are properly auto-fit.

Though it may work with my sample data, is there any other better way to auto-adjust column width using pd.ExcelWriter?

The Great
  • 7,215
  • 7
  • 40
  • 128
  • Your method is what I would have suggested and have used successfully in the times I have needed 'auto adjust' in xlsxwriter. There is a couple of different options in this SO questions if you haven't already read it https://stackoverflow.com/questions/33665865/adjust-cell-width-in-excel. I tried the one suggested by dtl85 before as slightly different option, however it did not adjust the columns properly. – moken Sep 12 '22 at 12:24

1 Answers1

1

This code works when using 'openpyxl' as your engine, sometimes pip install xlsxwriter wont solve the issue. This code below works like a charm. Edit any part as you wish.

def text_length(text):
    """
    Get the effective text length in characters, taking into account newlines
    """
    if not text:
        return 0
    lines = text.split("\n")
    return max(len(line) for line in lines)

def _to_str_for_length(v, decimals=3):
    """
    Like str() but rounds decimals to predefined length
    """
    if isinstance(v, float):
        # Round to [decimal] places
        return str(Decimal(v).quantize(Decimal('1.' + '0' * decimals)).normalize())
    else:
        return str(v)


def auto_adjust_xlsx_column_width(df, writer, sheet_name, margin=3, length_factor=1.0, decimals=3, index=False):

    sheet = writer.sheets[sheet_name]
    _to_str = functools.partial(_to_str_for_length, decimals=decimals)
    # Compute & set column width for each column
    for column_name in df.columns:
        # Convert the value of the columns to string and select the 
        column_length =  max(df[column_name].apply(_to_str).map(text_length).max(), text_length(column_name)) + 5
        # Get index of column in XLSX
        # Column index is +1 if we also export the index column
        col_idx = df.columns.get_loc(column_name)
        if index:
            col_idx += 1
        # Set width of column to (column_length + margin)
        sheet.column_dimensions[openpyxl.utils.cell.get_column_letter(col_idx + 1)].width = column_length * length_factor + margin
    # Compute column width of index column (if enabled)
    if index: # If the index column is being exported
        index_length =  max(df.index.map(_to_str).map(text_length).max(), text_length(df.index.name))
        sheet.column_dimensions["A"].width = index_length * length_factor + margin

Example of using it

with pd.ExcelWriter('report.xlsx', sheet_name="results", engine='openpyxl') as writer:
    report_df.to_excel(writer,index=False, sheet_name="results")
    auto_adjust_xlsx_column_width(report_df, writer, "results")
Imperial_J
  • 306
  • 1
  • 7
  • 23