1

I have an xlsx that has two sheets: on has some data in G1:O25 (let's call this "data") and one that has some images inserted into cells in G1:O25 (let's call this one "images").

My goal is to use Python to filter the data using images. I want a popup that shows me image from cell G1 along with a checkbox or something to include/exclude this data point. Then create a new sheet ("filtered data") with the included data points.

I'm new to Python so bear with me, but I've figured out a couple things from searching:

  1. I can load the data into a list.
  2. xlsx files are actually zip files so I can use zipfile and matplotlib to read the images from subdirectories display them.
  3. It shouldn't be hard to add the checkbox thing and do the filtering.

The issues I am having:

  1. Since openpyxl does not preserve the images when reading/writing to a workbook, I would loose the images when I append my "filtered data" sheet. Maybe there is a workaround like saving to a seperete sheet and using COM?
  2. Although I can load the images using the zip method, I lose information on which cell they are associated with. They are in a logical order inside the xlsx/zip file, but sometimes there will be a missing image (i.e. say cell K11 does not have an image) so I cannot just assume that image1.jpeg corresponds to cell G1 and so on and so forth). I am not sure where in the excel file I can find info associating images to their respective cells in the spreadsheet.

Thank you in advance

LoveToCode
  • 788
  • 6
  • 14
  • Is using the Excel's COM interface not an option? – ivan_pozdeev Aug 25 '18 at 21:03
  • Possible duplicate of [how to get the relative position of shapes within a worksheet](https://stackoverflow.com/questions/22773816/how-to-get-the-relative-position-of-shapes-within-a-worksheet) – ivan_pozdeev Aug 25 '18 at 21:22
  • It's not that COM is not an option, I just know it's probably not going to be easy and since I'm learning python I'd rather not have to also learn VBA. Thank you for the suggestion. – LoveToCode Aug 25 '18 at 21:49
  • When you use COM from Python, you aren't using VBA; you are using the objects, methods and properties of whatever object model you are targeting (in this case, the Excel application object model). See [here](https://stackoverflow.com/a/13509702/111794). – Zev Spitz Aug 26 '18 at 04:33
  • Yeah, makes sense. That code looks very similar to COM I've used before to control IE from AutoHotkey. I think I'll use that to write out the "filtered data". I figure out how to parse cell locations from the xml files contained inside the .zip/.xlsx. – LoveToCode Aug 26 '18 at 07:22
  • openpyxl **does** now preserve images in files. – Charlie Clark Aug 26 '18 at 17:53
  • Charlie, can you clarify what you mean? As per the current documentation, "Only cells (including values, styles, hyperlinks and comments) and certain worksheet attribues (including dimensions, format and properties) are copied. All other workbook / worksheet attributes are not copied - e.g. Images, Charts." My tests confirm this, unless I am doing something wrong – LoveToCode Aug 29 '18 at 05:34
  • If the answer solved your problem, please consider [accepting](https://meta.stackexchange.com/questions/5234/how-does-accepting-an-answer-work) it. – ivan_pozdeev Oct 01 '18 at 00:20

1 Answers1

2

As per how to get the relative position of shapes within a worksheet , in Excel object model, you get the cell adjacent to an image by its .TopLeftCell property:

test pictures

import win32com.client
x=win32com.client.Dispatch("Excel.Application")
wb=x.Workbooks.Open("<path_to.xlsx>")
ws=wb.Sheets("Sheet1")
for i in ws.Shapes:
    print i.TopLeftCell.Address

prints:

$B$2
$B$5
$D$3
ivan_pozdeev
  • 33,874
  • 19
  • 107
  • 152