2

I have to read a certain .xlsx file (file1.xlsx), pull data from another .csv files (file2.csv) and write the processed data into the original file (file1.xlsx) along with some charts. The file has existing charts, I have looked into openpyxl, xlswriter, xlsread, and other python libraries.

It seems openpyxl is the only library which allows you to read and write files back with the same library. Others like xlswriter only write, or xlsread only read.

However, even openpyxl is limited, in that if I read a file which already has charts, the charts are lost.

The only way I can think of doing this from python only, is using a combination of openpyxl and pywin32. 1. extract data from file2.csv, process data and printout to another .xlsx file (or hold in memory) 2. copy data from memory or from file3.xlsx and past into file1.xlsx using pywin32.

Before, I dive into pywin32, I wanted to check if this is a feasible strategy. I am working in a Linux environment, so would have preferred to stay in this environment while doing the data processing, but I dont know whether there is another way out.

Has anybody read an excel workbook with charts and written data out to the same excel workbook with openpyxl or other libraries (WITHOUT LOSING THE CHARTS)?

alpha_989
  • 4,882
  • 2
  • 37
  • 48
  • 1
    PyWin32 would work on Windows only. It uses Excel COM interface and requires installed MS Excel. So charts can't be lost. Not sure why xlrd + xlsxWriter doesn't work for you. Never met such a task. – Vasily Ryabov Sep 24 '17 at 04:30
  • Thanks @VasilyRyabov. I mainly looked into detail into openpyxl. For xlsxwriter/xlrd, I did a cursory overview.. maybe I missed something. I know xlsxwriter cant read files, so xlrd has to be used to read existing files. Can xlrd read the workbook and pass it to xlswriter? Because these are separate libraries, I assumed that they are not compatible with each other and dont talk to each other. Can Xlrd read a workbook with charts and pass it to xlswriter to write out to? – alpha_989 Sep 24 '17 at 19:28
  • I have also though about xlwings. Is there any advantage of xlwings over pywin32, if I do have to go that route? I know the pywin32/xlwings using Excel COM, that is why I was trying to avoid that route.. – alpha_989 Sep 24 '17 at 19:33
  • Never did a detailed research about the best tool here. Sorry. Just have some experience with pyWin32 + Excel COM objects. And simple use of xlsxWriter. – Vasily Ryabov Sep 24 '17 at 19:43

2 Answers2

3

Preservation of charts is a feature that is being introduced in openpyxl version 2.5 (see here and here). So you could try to install and use the 2.5 development version of openpyxl.

Xukrao
  • 8,003
  • 5
  • 26
  • 52
  • I tried from my linux environment.. this doenst seem to work, even with 2.5.0a3. Having a chart in the .xlsx leads Excel2016 to recognize an error in the .xlsx file. When it corrects it the chart is gone. Do I need to operate in a windows environment to make this work? – alpha_989 Sep 24 '17 at 22:26
  • Oh, too bad. I haven't tried out the 2.5.0a3 version myself, but I would be very surprised if the OS that is used makes any difference at all. It might be that the chart preservation feature has a few bugs that still have to be ironed out, since it's still part of the development version. – Xukrao Sep 25 '17 at 23:29
0

Panda's maybe able to read and write to Excel. It uses xlrd to read excel files and xlswriter to write it out. So it seems that it can load the Excel file usign xlrd, into memory and pass it out to xlsxwriter, to save the file.

https://pandas.pydata.org/pandas-docs/stable/io.html#io-excel-writer

alpha_989
  • 4,882
  • 2
  • 37
  • 48