Convert a xlsx file with multiple sheets to multiple xlsx files

Question

I have an .xlsx file with multiple sheets and I want to split all the sheets into different .xlsx files. For example, I have a file, matt.xlsx and it has 5 sheets with names A,B,C,D,E. I want to split them into 5 files naming them a.xlsx, b.xlsx, c.xlsx, d.xlsx and e.xlsx.

The code I have written is as below but it does not write anything to the split files and have an error when I try to open them saying "Format of the file does not match"

import os
import csv
from openpyxl import load_workbook
import pandas as pd


def main():

# read wbook from xlsx, 
filepath = os.path.join(os.getcwd())
wb = load_workbook(filename='matt.xlsx')
dest_dir = os.path.join('desktop','new files')
os.makedirs(dest_dir, exist_ok=True)

# loop the sheet
for sheet in wb.worksheets:
    sheet_names = sheet.title.split(" ")
    sheet_names = [x.lower() for x in sheet_names]

    dest_path = os.path.join(dest_dir, sheet_names.xlsx')

    with open(dest_path, 'w') as f:
        writer = pd.ExcelWriter(f, engine='xlsxwriter')
        reader = sheet.rows
        next(reader, None)

Thank you in advance. I am a beginner using python and I would be grateful if anyone could help.

Matt did either of the answers below help you get to a resolution? — rs311, Nov 16 '18 at 19:46

score 5 · Accepted Answer · answered Nov 14 '18 at 02:57

I modified some of your code. You can do this without using pandas. After looking through some of the openpyxl documentation it looks like the below code could be a starting point for what you are trying to do.

This will copy the values of the worksheets to new workbooks but none of the formatting, etc.

See if the below code is a starting point for what you are trying to do.

from openpyxl import load_workbook, Workbook

wb = load_workbook(filename='matt.xlsx')

for sheet in wb.worksheets:
    new_wb = Workbook()
    ws = new_wb.active
    for row_data in sheet.iter_rows():
        for row_cell in row_data:
            ws[row_cell.coordinate].value = row_cell.value

    new_wb.save('{0}.xlsx'.format(sheet.title))

Please mark this as the answer if it helped you solve your problem. — AlwaysData, Nov 14 '18 at 11:23

rs311 · Answer 2 · 2018-11-14T03:15:50.527

If you did want to use pandas, I'd recommend something like the below. This will read in your data and create a file for each sheet which exists in your workbook.

Here's a link to the 10 minutes to pandas tutorial and the documentation for ExcelFile.

import os
import pandas as pd

# create an excel file object
reader = pd.ExcelFile('matt.xlsx')

# loop through all sheet names
for sheet in reader.sheet_names:
    #read in data
    df = pd.read_excel(path, sheet_name=sheet)
    # save data to excel in this location
    # '~/desktop/new files/a.xlsx', etc.
    df.to_excel(os.path.join('desktop','new files', sheet), index=False)

Convert a xlsx file with multiple sheets to multiple xlsx files

2 Answers2

Linked