1

I understand this has been asked a few times, but I can't quite figure out my issue even though I have tried many of the given answers. I am trying to read an Excel cell that is represented as a Date, but when it is imported into my program with Python, it is represented as a float.

The fourth column in my excel spreadsheet (i.e. index=3) contains all the dates that I am trying to convert from float to string. I am simply adding all the values to a list to do something with the entire row, before clearing it and moving on to the next row.

I have based my code below from this answer: https://stackoverflow.com/a/13962976/3480297

This is the sample data that I am working with (file):

enter image description here

import xlrd

workbook = xlrd.open_workbook(file)
worksheet = workbook.sheet_by_name('Sheet1')

num_rows = worksheet.nrows
num_cols = worksheet.ncols
values = []

i = 0
j = 0

while i < num_rows:
    while j < num_cols:
        if j == 3:
            values.append(xlrd.xldate_as_tuple(str(worksheet.cell_value(i, j)), 0))
        else:
            values.append(worksheet.cell_value(i, j))
        j += 1
    j = 0
    i += 1
    values = []

However, my above code gives me the below error at print(xlrd.xldate_as_tuple(str(worksheet.cell_value(i, j)), 0)) and I cannot figure out why.

TypeError: '<' not supported between instances of 'str' and 'float'

EDIT:

After debugging, the code is throwing an error where worksheet.cell_value(i, j) = 43588.0

EDIT2:

I removed the str(...) cast around the Date to include: values.append(xlrd.xldate_as_tuple(worksheet.cell_value(i, j)), workbook.datemode) but this is throwing an error:

TypeError: xldate_as_tuple() missing 1 required positional argument: 'datemode'

Adam
  • 2,384
  • 7
  • 29
  • 66
  • 3
    Try `print(worksheet.cell_value(i, j))` in order to help you debug, just to make sure you're not off by one column. Also you should use `for` loops here instead of `while` and your code will be a lot more readable if you replace `i` with `row` and `j` with `column`. – Dan Mar 03 '20 at 12:01
  • 1
    Try removing the str call, i.e. `print(xlrd.xldate_as_tuple(worksheet.cell_value(i, j), 0))` – Foxocube Mar 03 '20 at 12:01
  • @PedroLobito thanks for spotting that. I missed that out in my copy and paste. I've included it – Adam Mar 03 '20 at 12:03
  • i know thats not the question. but maybe you want to use pandas, which would probably make the problem much easier – luigigi Mar 03 '20 at 12:04
  • @Dan I've included print(...) as per your suggestion but I still don't see a reason why this error would be thrown. – Adam Mar 03 '20 at 12:04
  • Either `num_rows` or `num_cols` is a `str`. Cast it to an `int`. `int(worksheet.nrows)`... – Pedro Lobito Mar 03 '20 at 12:04
  • @PedroLobito I still get the same error after casting. – Adam Mar 03 '20 at 12:05
  • What's the value of `num_rows` and `num_cols` ? – Pedro Lobito Mar 03 '20 at 12:05
  • Add to your question, the output of `print(f"row {i} col {j}: {worksheet.cell_value(i, j)}")` for the value of `i` that you are getting the error on (and all the `j`s) and also, if it's not for the first row then also the output for the first row so we can see one that work properly. – Dan Mar 03 '20 at 12:06
  • @PedroLobito 9 and 6 respectively. – Adam Mar 03 '20 at 12:08
  • *"the code is throwing an error where worksheet.cell_value(i, j) = 43588.0"* for what values of `i` and `j`? – Dan Mar 03 '20 at 12:08
  • @Dan I've edited my question to include where the call to xldate_as_tuple is failing. It's doing so for a value of worksheet.cell_value(i, j) = 43588.0 and for i=1 and j=3 – Adam Mar 03 '20 at 12:08
  • why are you casting the date to a `str`? I imagine that `xldate_as_tuple` expects a float input... what happens when you try `xlrd.xldate_as_tuple(worksheet.cell_value(i, j), 0)`? – Dan Mar 03 '20 at 12:09
  • @Dan good point. I may have copied from a different answer incorrectly. I've modified it and edited my question to include a different error. – Adam Mar 03 '20 at 12:12
  • it would be so much easier if you would provide input data to reproduce your problem – luigigi Mar 03 '20 at 12:13
  • @luigigi I have done so as per your suggestion. – Adam Mar 03 '20 at 12:16
  • `values.append(xlrd.xldate_as_tuple(worksheet.cell_value(i, j)), workbook.datemode)` your close bracket is in the wrong place. Should be `values.append(xlrd.xldate_as_tuple(worksheet.cell_value(i, j), workbook.datemode))` – Dan Mar 03 '20 at 12:17
  • Does this answer your question? [How do I read a date in Excel format in Python?](https://stackoverflow.com/questions/1108428/how-do-i-read-a-date-in-excel-format-in-python) – Pedram Mar 03 '20 at 12:29

2 Answers2

2

I don't think you were supposed to cast the date as a string. Note I made a couple of other changes:

  • used for loops instead of while loops
  • please don't use single letter var names (like i, and j)
  • xldate_as_datetime makes more sense than xldate_as_tuple

note if you want the date displayed in a specific format try strftime

import xlrd

workbook = xlrd.open_workbook(file)
worksheet = workbook.sheet_by_name('Sheet1')

num_rows = worksheet.nrows
num_cols = worksheet.ncols
values = []

for row in range(1, num_rows):
    row_values = []
    for col in range(num_cols):
        if col == 3:
            datetime_ = xldate.xldate_as_datetime(worksheet.cell_value(row, col), workbook.datemode)
            row_values.append(datetime_)
        else:
            row_values.append(worksheet.cell_value(row, col))
    values.append(row_values)
Dan
  • 45,079
  • 17
  • 88
  • 157
  • Thank you Dan! The problem was with casting the date to string and wrong parentheses location. Is there a way to make the date into dd/mm/yyyy format rather than (yyyy, mm, dd, hh, mm, ss) as it currently is? For example, one of the outputs is (2019, 9, 10, 0, 0, 0) – Adam Mar 03 '20 at 12:21
  • Yes, the answer is in the link you posted i.e. `datetime.datetime(*` will convert from the tuple to a datetime object. If you then want a string with your format use [`strftime`](https://docs.python.org/3/library/datetime.html#datetime.date.strftime). I'll edit to parse the tuple to a datetime object, but I'll leave converting to a string to you. – Dan Mar 03 '20 at 12:24
  • Just saw this: stackoverflow.com/a/28220798/1011724 it makes much more sense. I'm going to edit again. Please test it and let me know if there's a mistake. – Dan Mar 03 '20 at 12:34
  • Ok will do! I'll test that out in just a bit. Thanks again! – Adam Mar 03 '20 at 12:35
  • Is there a reason that you've included the first for from the rows to start from 1 rather than from 0? I've tested everything out successfully but started the first loop from 0. – Adam Mar 03 '20 at 14:00
  • I assumed that row `0` was the header row, because your error was only appearing on row `i == 1`. Does this work better if you start the rows from 0? – Dan Mar 03 '20 at 14:01
  • 1
    You're right! What I didn't put (and therefore didn't mention) was that I'm doing some extra stuff with each entire row (and therefore the header one as well) which I'm disregarding in some calculations by doing "if row == 0". So long story short, you're answer still stands as correct. – Adam Mar 03 '20 at 14:54
1

i'm highly recommend to use pandas:

import pandas as pd
def read_db():
    data_list = []
    try:
        
        df = pd.read_excel(REVIEW_DB_URL, sheet_name='Sheet1')  # open the db
        for i in range(len(df)):
            data_list.append(df['name'][i], df['version'][i], df['datetime'][i],
            df['notes'][i])
            
        return data_list

    except Exception as e:
        print(e)
Tamil Selvan S
  • 562
  • 8
  • 25
yehezkel horoviz
  • 198
  • 3
  • 10
  • `except Exception` don't catch raw Exceptions. Also, if you're using pandas, then might as well just leave it as a df, no need to convert to a list. – Dan Mar 03 '20 at 12:22
  • 1
    looks good, i saw you're using datatime module, pay attention to other dated you are working with , read about the functionality of the module, it will keep you out of trouble. – yehezkel horoviz Mar 03 '20 at 12:33