4

I have been able to read an Excel cell value with xlrd using column and row numbers as inputs. Now I need to access the same cell values in some spreadsheets that were saved in .ods format.

So for example, how would I read with Python the value stored in cell E10 in an .ods file?

Community
  • 1
  • 1
user329257
  • 43
  • 1
  • 3
  • Some hints and a little code are at http://solyaris.wordpress.com/2007/07/06/reading-open-document-with-python/ It looks like it shouldn't be too, too hard (once you hack your way through XML. – msw Apr 29 '10 at 22:13
  • 1
    @msw: *Looks* like it shouldn't be too hard ... but that blogger hasn't been heard from since (almost 3 years ago) ;-) – John Machin Apr 30 '10 at 00:13
  • Can you depend on openoffice being installed? The OOo API has python bindings. Although the API is a bit bewildering, using it is bound to be more reliable than hacking the xml. – intuited Jun 06 '10 at 07:52

2 Answers2

5

Hacking your way through the XML shouldn't be too hard ... but there are complications. Just one example: OOo in their wisdom decided not to write the cell address explicitly. There is no cell attribute like address="E10" or column="E"; you need to count rows and columns.

Five consecutive empty cells are represented by
<table:table-cell table:number-columns-repeated="5" />

The number-colums-repeated attribute defaults to "1" and also applies to non-empty cells.

It gets worse when you have merged cells; you get a covered-table-cell tag which is 90% the same as the table-cell tag, and attributes number-columns-spanned and number-rows-spanned need to be figured into column and row counting.

A table:table-row tag may have a number-rows-repeated attribute. This can be used to repeat the contents of a whole non-empty row, but is most often seen when there are more than 1 consecutive empty rows.

So, even if you would be satisfied with a "works on my data" approach, it's not trivial.

You may like to look at ODFpy. Note the second sentence: """Unlike other more convenient APIs, this one is essentially an abstraction layer just above the XML format.""" There is an ODF-to-HTML script which (if it is written for ODS as well as for ODT) may be hackable to get what you want.

If you prefer a "works on almost everybody's data and is supported and has an interface that you're familiar with" approach, you may need to wait until the functionality is put into xlrd ... but this isn't going to happen soon.

John Machin
  • 81,303
  • 11
  • 141
  • 189
  • Given that two supposedly working libraries (pyexcel-ods, odfpy) died on the first file that I tested and third (ezodf) may silently return bogus data I would prefer to avoid writing my own processing. It would sooner or later run into some arcane and mysterious part of .ods file format - like these existing solutions. – reducing activity May 15 '16 at 16:15
2

From libraries that I tried ezodf was the one that worked.

from ezodf import opendoc, Sheet
doc = opendoc('test.ods')
for sheet in doc.sheets:
   print sheet.name
   cell = sheet['E10']
   print cell.value
   print cell.value_type

pyexcel-ods crashed, odfpy crashed and in addition its documentation is either missing or horrible.

Given that supposedly working libraries died on the first file that I tested I would prefer to avoid writing my own processing as sooner or later it would either crash or what worse fail silently on some weirder situation.

EDIT: It gets worse. ezodf may silently return bogus data.

reducing activity
  • 1,985
  • 2
  • 36
  • 64
  • Thanks for bringing the serious issue with `ezodf` to my attention [here](http://stackoverflow.com/a/29497292/1391441). Your edit was rejected, but I updated the answer and added the info as a comment. I've personally have a great experience using `pyexcel-ods`. – Gabriel May 15 '16 at 13:45
  • 1
    @Gabriel And in case of this crash pyexcel-ods is innocent. Problem is caused by bug in LibreOffice (https://bugs.documentfoundation.org/show_bug.cgi?id=97631). Just my bad luck that I triggered some edge case. – reducing activity May 15 '16 at 16:11
  • 1
    thanks a lots after using ozodf, I realize odfpy was wost of all, if you try to open huge file it just hung for time eating lots of CPU. – rho Jul 18 '16 at 15:02