I am trying to dump the data in excel (xlsx) file into a text file using xlrd module and having problems with float precessions.
Have found a couple of post with similar issue where precession was lost some where in the last of 16 decimal digits .
Below is the data copied from xlsx:
VALUE;DATA
1.01 HELLO
2.11 1/1/2014
3.21 ONE
4.31 1/1/2014 12:14
5.441 $10
6.241 TWO
77.11 Zulfi
8.11 99
9.11 99.999
10.11 0
Below is what I get from xlrd:
1.01|'HELLO'
2.1099999999999999|'2014-01-01 00:00:00.000000'
3.21|'ONE'
4.3099999999999996|'2014-01-01 12:14:00.000000'
5.4409999999999998|10.0
6.2409999999999997|'TWO'
77.109999999999999|'Zulfi'
8.1099999999999994|99.0
9.1099999999999994|99.998999999999995
10.109999999999999|0.0
- "2.11" has become "2.1099999999999999" and "4.31" has become "4.3099999999999996" etc...
- "$10" has become "10"
- Date values look little different (this one I understand and I get what I coded)
I do understand some mysteries of floats where there is a possibility of loosing precession for values which have too many digits post the decimal but here I have only two digits.
I am testing the output by comparing with that of opensource ETL tool "Pentaho" (written in java) which can read/write excel files and this tool seems to have no problem reading the values as they appear in the xlsx file (the fields were being read as both strings and numbers with length 30 and precession 20)
Here is what Pentaho reads
VALUE;DATA
1.01;HELLO
2.11;2014/01/01 00:00:00.000
3.21;ONE
4.31;2014/01/01 12:14:00.000
5.441; 10.0
6.241;TWO
77.11;Zulfi
8.11; 99.0
9.11; 99.999
10.11; 0.0
and below is my python code :
for rownum in xrange(sh.nrows):
for colnum in xrange(sh.ncols):
cell_obj = sh.cell(rownum,colnum)
cell_val=sh.cell_value(rownum,colnum)
if cell_obj.ctype == xlrd.XL_CELL_DATE:
year, month, day, hour, minute, second = xlrd.xldate_as_tuple(cell_val, wb.datemode)
py_date = datetime.datetime(year, month, day, hour, minute, second).strftime("%Y-%m-%d %H:%M:%S.%f")
cell_val = py_date
if (colnum==0):
row_values=repr(cell_val)
else :
row_values=row_values+fdel+repr(cell_val)
if (((row_values).find("\\n"))>-1):
NLFlag=1
file_output.write((row_values).replace('\\n','') + "\n")
row_values=''
file_output.close()
Any help on this is highly appreciated.
Thanks