-1

So basically, I want to update a worksheet with new data, overwriting existing cells in excel. Both files have the same column names (I do not want to create a new workbook nor add a new column).

Here I am retreiving the data that I want:

import pandas as pd
df1 = pd.read_csv
print(df1)

Ouput (I just copy and pasted the first 5 rows, there are about 500 rows total):

          Index  Type  Stage  CDID  Period  Index Value
0     812008000     6      2  JTV9  201706      121.570
1     812008000     6      2  JTV9  201707      121.913
2     812008000     6      2  JTV9  201708      121.686
3     812008000     6      2  JTV9  201709      119.809
4     812008000     6      2  JTV9  201710      119.841
5     812128000     6      1  K2VA  201706      122.030

The existing excel file has the same columns (and row total) as df1, but I just want to have the 'Index' column repopulated with the new values. Let's just say it looks like this (i.e. so I want the previous values for Index to go into the corresponding column):

          Index  Type  Stage  CDID  Period  Index Value
0     512901100     6      2  JTV9  201706      121.570
1     412602034     6      2  JTV9  201707      121.913
2     612307802     6      2  JTV9  201708      121.686
3     112808360     6      2  JTV9  201709      119.809
4     912233066     6      2  JTV9  201710      119.841
5     312128003     6      1  K2VA  201706      122.030

Here I am retrieving the excel file, and attempting to overwrite it:

from win32com.client import Dispatch
import os

xl = Dispatch("Excel.Application")
xl.Visible = True 

wbs_path = ('folder path')

for wbname in os.listdir(wbs_path):
    if not wbname.endswith("file name.xlsx"):
        continue
    wb = xl.Workbooks.Open(wbs_path + '\\' + wbname)
    sh = wb.Worksheets("sheet name")
    sh.Range("A1:A456").Value = df1[["Index"]]
    wb.Save()
    wb.Close()
xl.Quit()

But this doesn't do anything. If I type in strings, such as:

h.Range("A1:A456").Value = 'o', 'x', 'c'

This repeats o in cells through A1 through to A456 (it updates the spreadsheet), but ignores x and c. I have tried converting df1 into a list and numpy array, but this doesn't work.

Does anyone know a solution or alternative workaround?

DanONS
  • 195
  • 1
  • 2
  • 7

1 Answers1

1

If the index of the dataframe is the same you can update columns by using update(). It could work like this:

df1.update(df2['Index'].to_frame())

Note: the to frame() is probably not needed

EDIT:

Since you try to update a excel-file and not a dataframe, my answer is probably not enough. For this part I would suggest to load the file into a dataframe, update the data and save it.

df1 = pd.read_excel('file.xlsx', sheet_name='sheet_name')

# do the update

writer = pd.ExcelWriter('file.xlsx')
df1.to_excel(writer,sheet_name='sheet_name', engine='xlsxwriter')
writer.save()
EyJay
  • 180
  • 1
  • 8
  • Ah right, so do you know if it's possible to save the updated spreadsheet to overwrite the older file? If I name the file with the same name as the older version, will it overwrite? – DanONS Dec 07 '17 at 10:54
  • Take a look at this question: https://stackoverflow.com/questions/42370977/how-to-save-a-new-sheet-in-an-existing-excel-file-using-pandas Only the sheet will be overwritten - the file stays the same. But still I would make a copy of the sheet, if it is an iimportant one ;) – EyJay Dec 07 '17 at 11:01
  • Cool, that was interesting, I tried it out. While it does overwrite the sheets, I wanted sheet 2 and sheet 3 to remain intact. Anyway it might be a step in the right direction so I will see if it can do what I need – DanONS Dec 07 '17 at 11:43