0

I am new to Python and I am importing Excel data into postgreSQL. I have blank whitespaces in my Excel for the columns for Traffic and Week_Ending. The Try statement seems to function fine for the Week_Ending but for the Traffic it throws out an error. I checked the Excel and the error is showing up due to the single blank whitespace in one of the cells. I thought it would work for both the columns but it doesn't for the Traffic. Can anyone provide assistance please.

import psycopg2
import xlrd
import datetime
book = xlrd.open_workbook("T:\DataDump\8888.xlsx")
sheet = book.sheet_by_name("Builder_Traffic")
database = psycopg2.connect (database = "***", user="*")

cursor = database.cursor()
delete = """Drop table if exists "Python".buildertraffic"""
print (delete)
mydata = cursor.execute(delete) 




cursor.execute('''CREATE TABLE "Python".buildertraffic
   (Builder_Name varchar(55),   
Traffic integer,
Week_Ending date,
Project_ID integer
   );''')

print "Table created successfully"

query = """INSERT INTO "Python".buildertraffic (Builder_Name, Traffic,  Week_Ending, Project_ID)
VALUES (%s, %s, %s, %s)"""

for r in range(1, sheet.nrows):
    Builder_Name = sheet.cell(r,0).value
    Traffic = None
    try:
        Traffic = (sheet.cell(r,1).value)
    except:
        pass
    Week_Ending = None
    try:
        Week_Ending = xlrd.xldate.xldate_as_datetime(sheet.cell(r,2).value,book.datemode)
    except:
        pass
    Project_ID = sheet.cell(r,3).value

values = (Builder_Name, Traffic, Week_Ending, Project_ID)
cursor.execute(query, values)

cursor.close()
database.commit()
database.close()

print ""
print "All Done! Bye, for now."
print ""
columns = str(sheet.ncols)
rows = str(sheet.nrows)
print "I just imported Excel into postgreSQL" 

And the error shows up as :

Traceback (most recent call last):
File   "C:\Users\aqureshi\Desktop\Programming\PythonSQLNew\BuilderTraffic.py", line 47, in <module>
DataError: invalid input syntax for integer: " "
LINE 2: VALUES ('American Legend Homes', ' ', NULL, 2.0)
                                     ^
Jason Smith
  • 127
  • 1
  • 3
  • 14
  • You haven't shown us your whole program. You showed us how you get the values out of the spreadsheet, but not the (important!) part where those values are inserted into the database. – John Gordon Oct 07 '16 at 20:39

1 Answers1

0

You should be checking for a blank or empty string in that column:

traffic = sheet.cell(r, 1).value

I don't see in your code how you're actually executing the query but you should wrap traffic in int to parse an integer from a string: How to convert strings into integers in Python?

try:
    traffic = int(sheet.cell(r, 1).value)
except:
    traffic = -1 # or 0? some value to indicate this has NO traffic.
Community
  • 1
  • 1
  • 1
    Thanks omouse I have edited my original post. That's the original and the full code I have used for other tables and it has worked fine. – Jason Smith Oct 07 '16 at 20:51
  • Always assume the worst when parsing Excel files. Test for empty strings, spurious white space, invalid formats. If you can imagine it, you'll find it. If you can't imagine it, you'll find it. :-) – intrepidhero Oct 07 '16 at 20:52
  • @PLearner yep then your problem is that you need a good default value for Traffic and the other values when the parsing of them fails, something like `Traffic = 0` in the `except` block should work. –  Oct 07 '16 at 20:53
  • 1
    @Omouse is there anyway I can tell Python to leave it blank like it is for Week_Ending column. If I write 0 than the data might get intermingled with the real integer values in the column. – Jason Smith Oct 07 '16 at 20:56
  • @PLearner you're going to need to use a different SQL query when inserting that doesn't include the column. And allow the Traffic column to be null. –  Oct 07 '16 at 20:56
  • 1
    @Omouse please elaborate further. I don't understand please. – Jason Smith Oct 07 '16 at 20:58