I am running the below code. Everything works except that after running the last line I get a "TypeError: unsupported operand type(s) for /: 'str' and 'int'" error.
The issue, it turns out, relates to a particular Excel cell issue that seems to be prevalent in more than one of the +80 files I am reading into Python. Sometimes some of the cells in the Excel files I am reading read appear to be blank in Excel but aren't empty (this can be checked in Excel via the ISBLANK() function). Reading and importing empty cells isn't an issue, but importing/reading empty cells that empty to be blank but aren't registered as so by Excel are a problem.
Opening the Excel file and manually selecting and deleting the cell (which appeared empty, but somehow register as non-blank in Excel) seems to do the trick. However, I don't want to open every Excel file and start selecting and deleting empty cells for reassurance. Note also that it wasn't that the Excel cell in question just contained a space (ie, ""), it didn't so the problem seems to be something else.
How can I deal with this from with Python/Pandas? I tried all of the solutions in this thread and nothing seemed to work.
import openpyxl
import os
import pandas as pd
import time
from openpyxl import load_workbook
os.chdir('C://Files//Research')
directory = os.listdir('C://Files//Research')
df = pd.DataFrame()
start = time.time()
for file in directory:
if os.path.isfile(file):
file_name = file
workbook = load_workbook(filename = file, data_only=True)
sheet1 = workbook['3. ISO']
c5 = sheet1['C5'].value
c6 = sheet1['C6'].value
sheet11 = workbook['4. Survey Overview']
c10 = sheet11['C10'].value
c11 = sheet11['C11'].value
df = df.append(pd.DataFrame({
"File_name":file, "ISO":c5, "Output":c6,
"Nat":c10, "Urb": c11
}, index=[0]))
end = time.time()
print(end - start)
df['Output %'] = 100* (df['Output'] / df['Nat'])