1

I've the following script:

now = dt.datetime.now()
date_filter = now - timedelta(days=3)
list_ids = [1,2,3]
dq_connection = mysql.connector.connect(user='user', password='pass', host='localhost', database='db')
engine = create_engine('localhost/db')
cursor = connection.cursor(buffered=True)
query = ('''
SELECT *
FROM (SELECT * FROM myTable1 WHERE id in {%s}
WHERE date >= %s;
''')
df = pd.read_sql_query(query, connection,params=(list_ids,date_filter,))

And I want to have two filters on my query: 1) List all the IDs that I've on list_ids 2) Filter only the dates before date_filter.

The second Filter I can do it, but when I try with the list I got:

pandas.io.sql.DatabaseError: Execution failed on sql

What I am doing wrong?

Pedro Alves
  • 1,004
  • 1
  • 21
  • 47

2 Answers2

2

Because IN clause receives multiple values, you need to adjust prepared statement with requisite number of placeholders, %s, and then unpack list for parameters with func(*list). Plus no subquery is needed for both WHERE clauses.

query = '''SELECT * FROM myTable1 
           WHERE id in (%s, %s, %s) AND date >= %s;
        '''

df = pd.read_sql_query(query, connection, params=(*list_ids, date_filter))

For dynamic placeholders equal to length of list, integrate a str.join:

placeholders = ", ".join(["%s" for _ in list_ids])

query = '''SELECT * FROM myTable1 
           WHERE id in ({}) AND date >= %s;
        '''.format(placeholders)

df = pd.read_sql_query(query, connection, params=(*list_ids, date_filter))
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • What does the `*list_ids` do here? Why is it not just `list_ids`? – DaveRGP Apr 17 '20 at 09:45
  • 1
    @DaveRGP, as text in solution mentions, the [asterisk unpacks the list](https://stackoverflow.com/questions/3480184/unpack-a-list-in-python) so you can match each string within list to corresponding `%s` in SQL statement. Otherwise you will attempt to bind 2 objects (one being an iterable and not a scalar) for 4 params (expecting 4 scalars) and so database engine will err out. – Parfait Apr 17 '20 at 15:34
  • Very nice, one comment - you don't need to explicitly case the expression in the join as a list comprehension. For some syntactic sugar you can leave the expression as `placeholders = ", ".join("%s" for _ in list_ids)`. – Keiron Stoddart Mar 22 '21 at 21:27
  • @KeironStoddart, good point. I have since learned since this question, `join` allows generators. – Parfait Mar 22 '21 at 21:29
0

The accepted answer suggests creating a %s placeholder for every item in the list of ids. But that's not necessary.

Each of the Python MySQL db packages supports passing lists or tuples into parameters. It'll figure out how to "expand" the list into the appropriate SQL before sending it off to the database.

Here's a complete example with pymysql:

import pymysql
import pandas


with pymysql.connect(
    host='',
    user='',
    passwd='',
    db='',
) as conn:
    data = pandas.read_sql(
        sql="select * from my_table where id in %(my_ids)s", 
        con=conn, 
        params={'my_ids': [1, 2, 3,]},
    )

(As an aside, and it's been many years since the OP's question, and so I think this specific point isn't particularly useful to future readers, but I'm pretty sure the SQL error he received was more mundane. He's missing a closing parens at the end of this phrase: FROM (SELECT * FROM myTable1 WHERE id in {%s}, and shouldn't have curly braces around %s)

Adam Easterling
  • 2,266
  • 1
  • 21
  • 21