4

I need to query a table with an "in" clause, where the SQL looks like this:

select * from some_table where some_field in (?)

I originally took a naive approach and tried this:

in_items = [1,2,3,4]
df = pd.read_sql(MY_SQL_STATEMENT, con=con, params=[in_items]

Which did not work, it throws the following error:

The SQL contains 1 parameter markers, but 4 parameters were supplied

Where I'm stuck at, is figuring out how to pass a list of items as a single parameter.

I can do a string concatenation approach, something like:

MY_SQL = 'select * from tableA where fieldA in ({})'.format(
  ','.join([str(x) from x in list_items]))
df = pd.read_sql(MY_SQL, con=con)

I would rather avoid this approach if possible. Does anybody know of a way to pass a list of values as a single parameter?

I'm also open to a possibly more cleverer way to do this. :)

Gord Thompson
  • 116,920
  • 32
  • 215
  • 418
Ojingo
  • 202
  • 2
  • 9
  • 1
    Will [this](https://stackoverflow.com/a/28735827/6361531) work? – Scott Boston Dec 18 '18 at 18:59
  • @IgorRaush: huh... how does removing 175 characters from my post make my problem clearer? Feels like you just made this question more opaque. – Ojingo Dec 18 '18 at 19:26
  • @ScottBoston: thanks, let me check that out. – Ojingo Dec 18 '18 at 19:27
  • @Ojingo, I removed parts of the question I found redundant. Openings like "Help!" also don't add much in the way of content. Your question had all the necessary information, but terseness is always good. If you feel like some meaning is lost, feel free to revert all or part of my edit. – Igor Raush Dec 18 '18 at 20:05
  • @IgorRaush: gotcha. I didn't realize I could revert. I kept some of your edits, and added back a bit of my original. – Ojingo Dec 18 '18 at 20:26
  • Which MySQL client implementation are you using? – Igor Raush Dec 18 '18 at 20:43
  • @IgorRaush: sorry, it's actually sql-server. – Ojingo Dec 19 '18 at 21:02

3 Answers3

7

Simply string format the placeholders then pass in your params into pandas.read_sql. Do note, placeholder markers depend on DB-API: pyodbc/sqlite3 uses qmarks ? and most others use %s. Below assumes the former marker:

in_items = [1,2,3,4]
MY_SQL = 'select * from tableA where fieldA in ({})'\
           .format(', '.join(['?' for _ in in_items]))
# select * from tableA where fieldA in (?, ?, ?, ?)

df = pd.read_sql(MY_SQL, con=con, params=[in_items])
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • I think this won't work if the parameters are numbers. String concatenation with join only works on string. – gouravkr May 13 '20 at 09:35
  • @gouravkr, there is no string concatenation here. This is a parameterization solution. The `format` + `join` binds param placeholders, `?`, to the prepared SQL statement which is later binded with values in `read_sql` line. This solution should work for *any* data type. – Parfait May 13 '20 at 14:48
  • Got it. I thought the `', '.join()` part was supposed to join the items in `in_items` into a single string. I tried running this now and was able to understand what you've done. I just needed to replace `'?'` with `'%s'` because I was using PostgreSQL. – gouravkr May 16 '20 at 09:29
0

For me, using sqllite3, worked this way:

list_of_entries_to_retrive = pd.read_excel('.table_with_entries.xlsx')
list_of_entries_to_retrive = (cell_list['entries']).tolist()

conn = sqlite3.connect('DataBase.db')

queryString = 'SELECT * FROM table WHERE attribute IN (\'{}\');'.format('\',\''.join([_ for _ in list_of_entries_to_retrive]))
df = pd.read_sql(queryString, con=conn)

Do not worked this way:

df = pd.read_sql(queryString, con=conn,  params=[list_of_entries_to_retrive]))

Thanks

0

Now that pandas is more insistent on using a SQLAlchemy Connectable you may want to consider using a SQLAlchemy Core query:

import pandas as pd
import sqlalchemy as sa

engine = sa.create_engine("mssql+pyodbc://scott:tiger^5HHH@mssql_199")

# set up example
with engine.begin() as conn:
    conn.exec_driver_sql("DROP TABLE IF EXISTS some_table")
    conn.exec_driver_sql("CREATE TABLE some_table (some_field int)")
    conn.exec_driver_sql(
        "INSERT INTO some_table (some_field) VALUES (1), (2), (3)"
    )

some_table = sa.Table("some_table", sa.MetaData(), autoload_with=engine)

in_items = [1, 3]
qry = sa.select(some_table).where(some_table.c.some_field.in_(in_items))

engine.echo = True
df = pd.read_sql_query(qry, engine)
"""SQL emitted:
SELECT some_table.some_field 
FROM some_table 
WHERE some_table.some_field IN (?, ?)
[generated in 0.00065s] (1, 3)
"""
Gord Thompson
  • 116,920
  • 32
  • 215
  • 418