2

InputI am building and app for sql query genreation , i am able to generate it but receiving output in timestamp format , please help in converting it to to_date format. Here is my code:-

from pandas import *
table_name="ADI"
file_name=pandas.read_excel('supermarke.xlsx')    
def SQL_Insert(SOURCE, TARGET):
    sql_texts = []
    for index, row in SOURCE.iterrows():
        sql_texts.append(
            'INSERT INTO ' + TARGET + ' (' + str(', '.join(SOURCE.columns)) + ')   VALUES ' + str(tuple(row.values))+";")

    return ('\n'.join(sql_texts))
print(SQL_Insert(file_name, table_name)) 

Here is the output i am getting :-

INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (1, '3666 21st St', 'San Francisco', 'CA 94114', 'USA', 'Madeira', 8, Timestamp('2018-01-12 00:00:00'));
INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (2, '735 Dolores St', 'San Francisco', 'CA 94119', 'USA', 'Bready Shop', 15, Timestamp('2018-01-12 00:00:00'));
INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (3, '332 Hill St', 'San Francisco', 'California 94114', 'USA', 'Super River', 25, Timestamp('2018-01-12 00:00:00'));
INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (4, '3995 23rd St', 'San Francisco', 'CA 94114', 'USA', "Ben's Shop", 10, Timestamp('2018-01-12 00:00:00'));

Here what i am expecting :-

INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (1, '3666 21st St', 'San Francisco', 'CA 94114', 'USA', 'Madeira', 8,  TO_DATE('12/01/2018 00:00:00', 'MM/DD/YYYY HH24:MI:SS');
INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (2, '735 Dolores St', 'San Francisco', 'CA 94119', 'USA', 'Bready Shop', 15, TO_DATE('12/01/2018 00:00:00', 'MM/DD/YYYY HH24:MI:SS');
INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (3, '332 Hill St', 'San Francisco', 'California 94114', 'USA', 'Super River', 25,TO_DATE('12/01/2018 00:00:00', 'MM/DD/YYYY HH24:MI:SS');
INSERT INTO ADI (ID, Address, City, State, Country, Supermarket Name, Number of Employees, DATE)   VALUES (4, '3995 23rd St', 'San Francisco', 'CA 94114', 'USA', "Ben's Shop", 10, TO_DATE('12/01/2018 00:00:00', 'MM/DD/YYYY HH24:MI:SS');

Please help me out here

  • Quick fix: in your loop, first build the date string using [`strftime`](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Series.dt.strftime.html), then include it into the string with `str(tuple(row.values[:-1])) + custom_date_str + ";"`. Bigger fix: I suggest not manually building SQL query strings, and using a library like [PyPika](https://pypika.readthedocs.io/en/latest/2_tutorial.html#date-time-and-intervals). – Peter Leimbigler Nov 24 '18 at 16:42
  • @PeterLeimbigler, please elaborate , i didn't get it .Thanks – aditya singh Nov 24 '18 at 16:50

1 Answers1

0

If you must manually stitch together SQL query strings, try this (untested since I don't have your data):

import pandas as pd
table_name = 'ADI'
df = pandas.read_excel('supermarke.xlsx')

def SQL_Insert(SOURCE, TARGET):
    sql_texts = []
    for index, row in SOURCE.iterrows():
        formatted_date = pd.to_datetime(row[-1]).strftime('%m/%d/%Y %H:%M:%S')
        date_str = 'TO_DATE(\'{}\', \'MM/DD/YYYY HH24:MI:SS\')'.format(formatted_date)

        sql_texts.append((
            'INSERT INTO ' + 
            TARGET + 
            ' (' + 
            str(', '.join(SOURCE.columns)) + ')   VALUES ' + 
            str(tuple(row.values[:-1])) + 
            date_str +
            ';'))

    return ('\n'.join(sql_texts))

print(SQL_Insert(df, table_name)) 

I recommend not manually building query strings. It's a solved problem. Further reading: How do I get a raw, compiled SQL query from a SQLAlchemy expression?

Peter Leimbigler
  • 10,775
  • 1
  • 23
  • 37
  • i am getting below error:- formatted_date = row[-1].datetime.strptime('%m/%d/%Y %H:%M:%S') AttributeError: 'str' object has no attribute 'datetime' – aditya singh Nov 24 '18 at 17:29
  • Try what I actually wrote, which is `formatted_date = row[-1].strftime('%m/%d/%Y %H:%M:%S')` (no `.datetime`) – Peter Leimbigler Nov 24 '18 at 20:34
  • I tried the same getting below error. formatted_date = row[-1].strftime('%m/%d/%Y %H:%M:%S') AttributeError: 'str' object has no attribute 'strftime' – aditya singh Nov 25 '18 at 12:10
  • @adityasingh, then you have to run `pd.to_datetime()` on your date column to convert it from string format. I've edited my answer, but without seeing your data, I can't be sure if it will work. – Peter Leimbigler Nov 25 '18 at 16:49
  • Still Receiving the error File "C:\Users\adiec\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dateutil\parser\_parser.py", line 1356, in parse return DEFAULTPARSER.parse(timestr, **kwargs) File "C:\Users\adiec\AppData\Local\Programs\Python\Python37-32\lib\site-packages\dateutil\parser\_parser.py", line 648, in parse raise ValueError("Unknown string format:", timestr) ValueError: ('Unknown string format:', 'null') Editted the question added input data – aditya singh Nov 26 '18 at 16:40
  • @adityasingh, sounds like at least one of your date fields is blank in the Excel file. You'll have to add a `fillna()` step, or exception handling. I suggest opening a new question so it receives more attention from the community. – Peter Leimbigler Nov 26 '18 at 17:26