0

I have an sqlite database and there's a table called cached_dates which simply has a row for each date, where the date is type text and looks like YYYY-MM-DD.

I also have a list of dates in the same string format and I want to know which dates in that last are not in the cached_dates table.

I can't figure out how to encode the string in a way that I can pass it into the pandas read_sql function using parameter substitution.

I can do it using string formatting but I know that's susceptible to injection attacks so I'd like to do it properly but I can figure out the syntax. Here's what I'm doing now:

Create the table:

CREATE TABLE IF NOT EXISTS cached_dates (date text NOT NULL PRIMARY KEY);

Insert something into the table:

insert into cached_dates values('2018-01-03');

This is the query I want to make:

SELECT *
FROM
(
    VALUES('2018-01-01'),('2018-01-02'),('2018-01-03'),('2018-01-04'),('2018-01-05')
)
EXCEPT
SELECT date FROM cached_dates;

This is the python code I'm using:

date_list = ['2018-01-01','2018-01-02','2018-01-03','2018-01-04','2018-01-05']

# Convert to string that can be passed to VALUES
dates_str = ''
for date in date_list:
    dates_str = dates_str + "('{}'),".format(date)
dates_str = dates_str[:-1]

sql = ( 'SELECT * '
        'FROM ' 
        '('
        'VALUES {}'
        ') '
        'EXCEPT '
        'SELECT date FROM cached_dates;').format(dates_str)

df = pd.read_sql(sql, self._conn)

I'd like to use the params parameter of read_sql but I couldn't get that to work. Can someone please show me how?

Thanks!

Brett Elliot
  • 922
  • 1
  • 6
  • 9

1 Answers1

0

you can try this

sql = ( 'SELECT * '
        'FROM ' 
        '('
        'VALUES %(date_str)s'
        ') '
        'EXCEPT '
        'SELECT date FROM cached_dates;')

df = pd.read_sql(sql, self._conn, params = {"date_str" : dates_str })

You can take reference from here

Rajat Mishra
  • 3,635
  • 4
  • 27
  • 41
  • Hmmm.. That gives me an error too... `Execution failed on sql 'SELECT * FROM (VALUES %(date_str)s) EXCEPT SELECT date FROM cached_dates;': near "%": syntax error`. – Brett Elliot Oct 06 '18 at 17:46