0
import openpyxl

def read_excel():
    # Create a new workbook and add a worksheet
    workbook = openpyxl.Workbook()
    worksheet = workbook.create_sheet('Sheet1')

    # Use a while loop to generate and write the rows one at a time
    row = 1
    n = 1
    while row <= 6306456:  # 1048576*6+15000
        # If the row number is greater than 1,048,576, create a new worksheet
        # and reset the row number to 1

        c = row
        
        # if row > 1048576:
        if c > (1048576 * n):
            worksheet = workbook.create_sheet(f'Sheet{row // 1048576 + 1}')
            c = c - (1048576 * n)
            n = (row // 1048576) + 1

           
        # Write the row number to the first column of the worksheet
        worksheet.cell(row=c, column=1).value = row
        row += 1


        # Print the progress of the operation
        print(f'{row} of 6306456 rows written')

    # Save the workbook
    workbook.save('row_file.xlsx')

if __name__ == '__main__':
    # Call the read_excel function
    read_excel()

Above is the script,but it doesn't work,and it showed row is over 1048576.

I have 1048576*6+15000=6306456 rows of data to write in a single .xlsx file.

Could someome please help to fix the script when using openpyxl to create another new worksheets when a sheet reached 1048576 rows.

Therefore,the result is a .xlsx file with 7 worksheets to store more than 6306456 rows.

Thanks a lot.

Stanley1
  • 3
  • 2
  • 2
    Why are you trying to save 6M rows to an Excel file? Which *human* is going to use this? Excel isn't a database. You can save to multiple sheets but that makes sense when you want to save different data or you want to group the data by eg category, product, date. – Panagiotis Kanavos Dec 07 '22 at 13:50
  • Where does the data come from and what does it contain? If you use Pandas, you can split the dataframe into chunks and save each to a different sheet. [These answers](https://stackoverflow.com/questions/44729727/pandas-slice-large-dataframe-into-chunks) show how to slice a big dataframe by size or fields. [This page from the XlsxWriter docs](https://xlsxwriter.readthedocs.io/example_pandas_multiple.html) shows how to save each slice to a different sheet – Panagiotis Kanavos Dec 07 '22 at 13:55
  • I just wanna test openpyxl to split multi-sheets in a single xlsx file.Thanks – Stanley1 Dec 07 '22 at 13:58
  • 2
    Why? That's not "just". Excel files are used by humans, either to read data or for analysis. They aren't meant for data transfer. A human would have *real* trouble trying to combine 6 different sheets into a single pivot table. Applications and developers would hate such a file too, because they'd have to write extra code to iterate sheets – Panagiotis Kanavos Dec 07 '22 at 13:58
  • The data source matters too. If the data source is Pandas, you can split with 1 line (`list_df = np.array_split(df, n)`)and save with 4 more lines, 1 to create an ExcelWriter, 2 to loop over `list_df` and write to the writer, and 1 to save the file. The sheet would still be very unwieldy. – Panagiotis Kanavos Dec 07 '22 at 14:03
  • Dear Panagiotis Kanavos,could you please provide a example code for np.array_split? Thanks a lot. – Stanley1 Dec 07 '22 at 14:06
  • I just did - it's just that single line. Where *does* the data come from though? There are a *lot* of techniques that can be used. If you use Pandas, you could split the data while loading them by using `chunksize` for example. You wouldn't need `np.array_split`. Or you could group the dataframe by a specific column and get meaningful sheets – Panagiotis Kanavos Dec 07 '22 at 14:08
  • But my script is first read excel with pandas,and then to_numpy,and when getting the data I want,then write it to new xlsx with openpyxl.chunksize has to be used when reading the original xlsx file. Not the result xlsx file? – Stanley1 Dec 07 '22 at 14:23
  • Could openpyxl writing cells be combined with pandas' chunksize function in the same time? – Stanley1 Dec 07 '22 at 14:29
  • Use a counter (`enumerate` springs to mind), together with the remainder operator (`%`) to create sheets and write to them. – Charlie Clark Dec 07 '22 at 16:56

0 Answers0