0

I've got the following data that I am retrieving from an Excel spreadsheet:

enter image description here

As you can see, there are rows that have floats and others that have ints. Using my below program, if I retrieve each value using worksheet.cell_value(row, col), then each int is being retrieved as a float. For example, 321 in the Random Number column is being retrieved as 321.0.

I do not want to have decimal points if the number does not include it. I've seen a couple of examples online, specifically this one, but I've not been successful with any of the answers.

So far, I've got:

import xlrd

workbook = xlrd.open_workbook("demo.xlsx")
worksheet = workbook.sheet_by_name('Sheet1')

num_rows = worksheet.nrows
num_cols = worksheet.ncols

for row in range(num_rows):
    for col in range(num_cols):
        if isinstance(worksheet.cell_value(row, col), float):
            print(worksheet.cell_value(row, col))
        elif isinstance(worksheet.cell_value(row, col), int):
            print(int(worksheet.cell_value(row, col)))
        else:
            print(worksheet.cell_value(row, col))

But this returns the the ints with decimal points:

Version
Name
Random Number
1.1
Smith
321.0
1.2
John
1234.0
2.1
Paul
123456.0
2.2
Rich
98765.0
2.3
Harvey
567.0
2.4
Jones
90909.0

Alternatively, I have tried:

import xlrd

workbook = xlrd.open_workbook("demo.xlsx")
worksheet = workbook.sheet_by_name('Sheet1')

num_rows = worksheet.nrows
num_cols = worksheet.ncols

for row in range(num_rows):
    for col in range(num_cols):
        if worksheet.cell_value(row, col) == int(worksheet.cell_value(row, col)):
            print(int(worksheet.cell_value(row, col)))
        else:
            print(worksheet.cell_value(row, col))

But I get an error:

ValueError: invalid literal for int() with base 10: 'Version'

Adam
  • 2,384
  • 7
  • 29
  • 66
  • `I do not want to have decimal points if the number does not include it.` - Other values in that column are actual floats, so I'd recommend keeping them like that (keeping whole column one type), and just adjust printing format. – h4z3 Mar 06 '20 at 12:07
  • As for the error - you also grab headers. You need to check that your value if float first: change the if line to `if isinstance(worksheet.cell_value(row, col), float) and worksheet.cell_value(row, col) == int(worksheet.cell_value(row, col))` – h4z3 Mar 06 '20 at 12:09
  • @h4z3 I was just thinking, and there isn't a use case for my project where a column would have mixed floats and ints. It's either one or the other. I've edited my question to reflect this. How would I adjust the printing format in that case? – Adam Mar 06 '20 at 12:11
  • I'm also not sure what you mean by your second answer. Why would I be checking if the value is an instance of float and also of int type? – Adam Mar 06 '20 at 12:13

1 Answers1

1

I was on the right track with the second code example, but I had to make sure that any string is being evaluated as well:

import xlrd

workbook = xlrd.open_workbook("demo.xlsx")
worksheet = workbook.sheet_by_name('Sheet1')

num_rows = worksheet.nrows
num_cols = worksheet.ncols

for row in range(num_rows):
    for col in range(num_cols):
        if isinstance(worksheet.cell_value(row, col), str):
            print(worksheet.cell_value(row, col))
        elif worksheet.cell_value(row, col) == int(worksheet.cell_value(row, col)):
            print(int(worksheet.cell_value(row, col)))
        else:
            print(worksheet.cell_value(row, col))

Result:

Version
Name
Random Number
1.1
Smith
321
1.2
John
1234
2.1
Paul
123456
2.2
Rich
98765
2.3
Harvey
567
2.4
Jones
90909
Adam
  • 2,384
  • 7
  • 29
  • 66