8

I wanted to import a file containing text, numbers and dates using xlrd on Python.

I tried something like:

if "/" in worksheet.cell_value:
    do_this
else:
    do_that  

But that was of no use as I latter discovered dates are stored as floats, not strings. To convert them to datetime type I did:

try:
    get_row = str(datetime.datetime(*xlrd.xldate_as_tuple(worksheet.cell_value(i, col - 1), workbook.datemode)))
except:
    get_row = unicode(worksheet.cell_value(i, col - 1))

I have an exception in place for when the cell contains text. Now i want to get the numbers as numbers and the dates as dates, because right now all numbers are converted to dates.

Any ideas?

dreftymac
  • 31,404
  • 26
  • 119
  • 182
Antoni4040
  • 2,297
  • 11
  • 44
  • 56

2 Answers2

10

I think you could make this much simpler by making more use of the tools available in xlrd:

cell_type = worksheet.cell_type(row - 1, i)
cell_value = worksheet.cell_value(row - 1, i)

if cell_type == xlrd.XL_CELL_DATE:
    # Returns a tuple.
    dt_tuple = xlrd.xldate_as_tuple(cell_value, workbook.datemode)
    # Create datetime object from this tuple.
    get_col = datetime.datetime(
        dt_tuple[0], dt_tuple[1], dt_tuple[2], 
        dt_tuple[3], dt_tuple[4], dt_tuple[5]
    )
elif cell_type == xlrd.XL_CELL_NUMBER:
    get_col = int(cell_value)
else:
    get_col = unicode(cell_value)
chewynougat
  • 1,099
  • 2
  • 11
  • 19
  • 7
    Two things. (1) You can unpack a tuple using the `*` operator, so `get_col = datetime.datetime(*dt_tuple)`. (2) As of xlrd 0.9.3, there is a function, `xldate.xldate_as_datetime`, which can be used instead of `xldate_as_tuple`. (To be fair, 0.9.3 is newer than this answer.) – John Y Aug 25 '14 at 21:53
8

Well, never mind, I found a solution and here it is!

try:
    cell = worksheet.cell(row - 1, i)
    if cell.ctype == xlrd.XL_CELL_DATE:
        date = datetime.datetime(1899, 12, 30)
        get_ = datetime.timedelta(int(worksheet.cell_value(row - 1, i)))
        get_col2 = str(date + get_)[:10]
        d = datetime.datetime.strptime(get_col2, '%Y-%m-%d')
        get_col = d.strftime('%d-%m-%Y')
    else:
        get_col = unicode(int(worksheet.cell_value(row - 1, i))) 
except:
    get_col = unicode(worksheet.cell_value(row - 1, i))

A bit of explanation: it turns out that with xlrd you can actually check the type of a cell and check if it's a date or not. Also, Excel seems to have a strange way to save daytimes. It saves them as floats (left part for days, right part for hours) and then it takes a specific date (1899, 12, 30, seems to work OK) and adds the days and hours from the float to create the date. So, to create the date that I wanted, I just added them them and kept only the 10 first letters ([:10]) to get rid of the hours(00.00.00 or something...). I also changed the order of days_months-years because in Greece we use a different order. Finally, this code also checks if it can convert a number to an integer(I don't want any floats to show at my program...) and if everything fails, it just uses the cell as it is(in cases there are strings in the cells...). I hope that you find that useful, I think there are other threads that say that this is impossible or something...

Antoni4040
  • 2,297
  • 11
  • 44
  • 56
  • 1
    Excel on Windows/Mac differ on the origin point - that's what the datemode attribute on the workbook is for - so you should be using that and not hard coding the datetime. It's also possible (although unlikely) that by truncating elements from the float (as that's how date/times are stored in Excel) that you shave off enough information that will put your date off by a day. What'd be closer to correct on this is your original question, but just add in the cell.ctype check and you're done... – Jon Clements Jul 25 '13 at 08:45
  • The hours contained in float are all 00.00.00, so it's nothing... And what I used in my original question was difficult to handle, maybe even buggy, I don't know why, so I couldn't change the order of the date(and that is essential...) – Antoni4040 Jul 25 '13 at 10:13
  • The formula is not correct on Windows too e.g., if `cell_value=30` your formula returns `29-01-1900` then it should return `30-01-1900`, see [How to convert a given ordinal number (from Excel) to a date](http://stackoverflow.com/q/29387137/4279). OR (better) use `xlrd` methods as shown in [@chewynougat' answer/comments](http://stackoverflow.com/a/18016189/4279). Also `date + get_` is already a datetime object: no need to convert it to str, only to convert it back to datetime, only to convert it to str in another format -- just convert it to a desired format directly using `.strftime()`. – jfs Nov 28 '15 at 05:58