I have an sqlite database and there's a table called cached_dates
which simply has a row for each date, where the date is type text
and looks like YYYY-MM-DD
.
I also have a list of dates in the same string format and I want to know which dates in that last are not in the cached_dates
table.
I can't figure out how to encode the string in a way that I can pass it into the pandas read_sql
function using parameter substitution.
I can do it using string formatting but I know that's susceptible to injection attacks so I'd like to do it properly but I can figure out the syntax. Here's what I'm doing now:
Create the table:
CREATE TABLE IF NOT EXISTS cached_dates (date text NOT NULL PRIMARY KEY);
Insert something into the table:
insert into cached_dates values('2018-01-03');
This is the query I want to make:
SELECT *
FROM
(
VALUES('2018-01-01'),('2018-01-02'),('2018-01-03'),('2018-01-04'),('2018-01-05')
)
EXCEPT
SELECT date FROM cached_dates;
This is the python code I'm using:
date_list = ['2018-01-01','2018-01-02','2018-01-03','2018-01-04','2018-01-05']
# Convert to string that can be passed to VALUES
dates_str = ''
for date in date_list:
dates_str = dates_str + "('{}'),".format(date)
dates_str = dates_str[:-1]
sql = ( 'SELECT * '
'FROM '
'('
'VALUES {}'
') '
'EXCEPT '
'SELECT date FROM cached_dates;').format(dates_str)
df = pd.read_sql(sql, self._conn)
I'd like to use the params
parameter of read_sql
but I couldn't get that to work. Can someone please show me how?
Thanks!