Dataframe to excel sheet not working as expected

Question

I have a dataframe with certain variables. When I print the dataframe within the loop, I see all the different rows of data. But when I try to write this data into a spreadsheet on Excel, only the last row is written.

file1r = open("Catastrophe Reinsurance.txt","r")
lines_counter_18 = 1
for line in file1r:
    line = line.rstrip("\n")
    if lines_counter_18 > 9:
       year = line[0:4]
       year_2013 = line[7:18]
       year_2014 = line[21:32]
       year_2015 = line[35:46]
       year_2016 = line[49:60]
       year_2017 = line[63:74]
       year_2018 = line[77:88]
       df_1r = pd.DataFrame({'year':year,'year_2013':year_2013,'year_2014':year_2014,'year_2015':year_2015,'year_2016':year_2016,'year_2017':year_2017,'year_2018':year_2018},index=[0])
       #print(df_1r)
    lines_counter_18 += 1
    df_1r.to_excel(writer,sheet_name='Catastrophe Reinsurance',index=True)

I define the writer outside of the loop as : writer = pd.ExcelWriter('test.xlsx', engine='xlsxwriter')

I save the workbook outside of the loop as well. The looping logic does not seem to work since only the last row is written to the sheet. Am I missing something?

this code does not execute, please provide a Minimal, Reproducible Example — Setop, Nov 24 '19 at 22:03

score 0 · Answer 1 · answered Nov 25 '19 at 00:04

to_excel will overwrite the file so you're only getting the df_1r from the last iteration. There are ways to append to an excel sheet rather than overwrite (Append existing excel sheet with new dataframe using python pandas) but if you're not worried about running out of memory it's much simpler to build a single data frame by appending rows to your dataframe within the loop (see pandas append) and then write the whole dataframe to the excel file once outside of the loop.

Also, as your loop currently stands you will be creating duplicate rows when lines_counter_18 is less than 9 (df_1r will exist from the previous iteration in the loop and get written again).

score 0 · Answer 2 · answered Nov 25 '19 at 00:24

I guess you need to concat each data frame to a one final data frame and save.otherwise each df_1r will overwrite the previous data frame.

file1r = open("Catastrophe Reinsurance.txt","r")
lines_counter_18 = 1
final = pd.DataFrame()
for line in file1r:
    line = line.rstrip("\n")
    if lines_counter_18 > 9:
       year = line[0:4]
       year_2013 = line[7:18]
       year_2014 = line[21:32]
       year_2015 = line[35:46]
       year_2016 = line[49:60]
       year_2017 = line[63:74]
       year_2018 = line[77:88]
       df_1r = pd.DataFrame({'year':year,'year_2013':year_2013,'year_2014':year_2014,'year_2015':year_2015,'year_2016':year_2016,'year_2017':year_2017,'year_2018':year_2018},index=[0])
       #print(df_1r)
    final = pd.concat([final,df_1r])
    lines_counter_18 += 1

final.to_excel(writer,sheet_name='Catastrophe Reinsurance',index=True)

Dataframe to excel sheet not working as expected

2 Answers2