1

I'm using Python 3.5 and win32com to iterate through an excel document rows (only from the first column) and get all the values from it, the best would be in a simple list. I tried the following implementation but there are a few problems with it:

excel_app = win32com.client.Dispatch('Excel.Application')
workbook = excel_app.Workbooks.Open(myfile2)
worksheet = workbook.Sheets('Sheet1')
data = excel_app.Range("A1:A60")
print(data)

for i in data:
    if i is not None:
        print(i)

The problems are mainly with: "data = excel_app.Range("A1:A60")"

  1. This takes the first 60 rows, I would like to get only as many rows as there is data in the excel file, otherwise either i don't get enough data, or i get too much and i end up with a bunch of 'None'

  2. It returns a Tuple, so i can't modify it in order to remove the 'None' values if i get more data than the spreadsheet has.

Dantuzzo
  • 271
  • 6
  • 25
  • 1
    Any specific reason you're using win32com? It's one of the harder ways to do it. – Alan May 18 '18 at 08:32
  • yes, I need win32com because it's the only library i know that lets me embed objects into excel if my condition is true (if i is not None), with the following code: Embedded_object.Add(ClassType=None, Filename=file_loction, Link=False, DisplayAsIcon=True, Left=3, Top=0, Width=50, Height=50) – Dantuzzo May 18 '18 at 08:41
  • Ok, understood... but if reading is so unconvenient and embedding is only possible with win32com - why not mix it? use win32com for everything which works and has no alternative and the simple reading issues can be solved with xlrd – SpghttCd May 18 '18 at 08:45

2 Answers2

4

You can get the row count by using:

worksheet.UsedRange.Rows.Count

where UsedRange represents all the cells that have contained a value at any time.

Once you have the data in your tuple, you can copy it into a list for processing - e.g. new_list = data[:].

If you want to copy into a new list and remove None values at the same time:

new_list = [ item for item in data if item is not None]
Alan
  • 2,914
  • 2
  • 14
  • 26
  • For some reason, this gives unexpected high counts at more complicated sheets, possibly because later rows used to contain data. Would it be possible to determine the `UsedRange` on current data presence in a certain column, say, `A`? – Mast Jul 30 '19 at 11:27
  • 1
    @Mast This is a well-known phenomena - see [this post](https://stackoverflow.com/questions/11169445/error-in-finding-last-used-cell-in-excel-with-vba) for a fairly comprehensive discussion. While it's VBA, the same principles apply. I would suggest that you post your specific requirements as a question - for example whether you have to use win32com like this original poster - and then you should get an answer – Alan Jul 31 '19 at 21:41
1

If you wouldn't mind using xlrd it would be much easier I think. This would be sth like

import xlrd

xlf = xlrd.open_workbook('/wherever/file/may/roam/asdf.xlsx')
xls = xlf.sheet_by_name('Sheet1')
data = xls.col_values(0)
SpghttCd
  • 10,510
  • 2
  • 20
  • 25