176

I have a Python list, say

l = [1,5,8]

I want to write a SQL query to get the data for all the elements of the list, say

select name from students where id = |IN THE LIST l|

How do I accomplish this?

informatik01
  • 16,038
  • 10
  • 74
  • 104
Mohit Ranka
  • 4,193
  • 12
  • 41
  • 41
  • l1 = ['a'], l2 = ['a', 'b'], How do I generate a statement like this, "SELECT * FROM table_name WHERE col_name IN ['a'];" , I've tried the answer and it doesn't work – David Wei Aug 15 '22 at 08:30
  • l = [1,5,8] l_str = str(l).replace( '[', '(' ).replace( ']', ')' ) – David Wei Aug 15 '22 at 09:21

16 Answers16

145

Answers so far have been templating the values into a plain SQL string. That's absolutely fine for integers, but if we wanted to do it for strings we get the escaping issue.

Here's a variant using a parameterised query that would work for both:

placeholder= '?' # For SQLite. See DBAPI paramstyle.
placeholders= ', '.join(placeholder for unused in l)
query= 'SELECT name FROM students WHERE id IN (%s)' % placeholders
cursor.execute(query, l)
Pi.Lilac
  • 135
  • 1
  • 7
bobince
  • 528,062
  • 107
  • 651
  • 834
  • 15
    `','.join(placeholder * len(l))` would be a bit shorter while still readable imho – ThiefMaster May 01 '13 at 20:50
  • 7
    @Thiefmaster: yes, would have to be `([placeholder]*len(l))` in the general case as the placeholder may be multiple characters. – bobince May 02 '13 at 00:28
  • 2
    @bobince in my case I have assigned a variable a='john' and b='snow' and stored it in a tuple with the name got=['a,b'] and did performed the same query. Doing this I got the erroy i.e. Type Error: not all arguments converted during string formatting. How am I supposed to solbe it – sigdelsanjog Jan 09 '15 at 19:55
  • 3
    Why do you want to join "by hand" rather than leave this work to the dbapi? the key is to use a tuple rather than a list... – Alessandro Dentella Jan 11 '17 at 09:24
  • @AlessandroDentella, sqlite doesn't support it. The [example](https://docs.python.org/2/library/sqlite3.html) explicitly adds the `?` in the INSERT clause, which is similar to the IN clause. – toto_tico Aug 31 '17 at 09:34
  • Yeah, passing a tuple as a parameter is something that isn't a standard DB-API feature, and for `%`-paramstyle this probably isn't doing what you think—it's actually calling `str` on the tuple and inserting the Python repr into the query string. That works when it's a tuple of at least two integers, all of which must be too small to get a `L` suffix. It fails for most other cases—and in particular fails for string literals in a way that makes it a SQL injection vulnerability. Don't do that! – bobince Sep 01 '17 at 19:10
  • I have a list of repetitive data that shall be passed as placeholders list but sql only executes as set – Pawankumar Dubey Mar 31 '18 at 17:44
  • 8
    Based on this answerm here is a single line solution for ***Python 3.6*** `cursor.execute(f'SELECT name FROM students WHERE id IN ({','.join('?' for _ in l)})', l)`. @bobince, you should also remark in your solution that using `?` is the safe way to go in terms of avoid SQL injection. There are a lot of answers here that are vulnerable, basically any that concatenates strings in python. – toto_tico May 24 '18 at 13:40
  • @roganjosh, agreed, and actually wrong: `cursor.execute(f"SELECT name FROM students WHERE id IN ({','.join('?' for _ in l)})", l)`. Unfortunately, there are not a lot of options if you want to avoid SQL injection, I would split the single line into multiple 2 or 3. In this case, a single line is convenient as an example of how to use Python `f-strings`. In the original answer, the 3rd line would look like this: `query= f'SELECT name FROM students WHERE id IN ({placeholders})'`. – toto_tico Oct 06 '18 at 10:30
  • @bobince Is it possible to elaborate more about the comment "all of which must be too small to get a L suffix. It fails for most other cases—and in particular fails for string literals in a way that makes it a SQL injection vulnerability"? – Lei Hao Aug 13 '19 at 06:39
  • @LeiHao In Python 2, a number with an L suffix, like `98L` is a long integer. If a number is too large to be a short integer, like `80000000000`, then `repr()` returns the value with an L suffux. Python 3 never does this because the integer type completely handles it Python 3. – Bacon Bits Feb 07 '20 at 13:46
  • 1
    what if I have to pass some extra parameter along with list `l` in the query. `query= 'SELECT name FROM students WHERE id IN (%s) and name= %(name)s' % placeholders` ?? – Rahul Reddy Apr 16 '20 at 22:22
  • 3
    Reading these comments and the other answers, I see why there have been so many SQL injection vulnerabilities. Almost no one seems to care (until they're pwned). – Paul Coccoli Feb 09 '22 at 20:41
  • For MySQL use %s instead of ? (https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursor-execute.html). – yagus Apr 12 '22 at 21:06
  • @toto_tico Thank you very much for your comment, I don't see any instructions on https://linuxhint.com/cursor-execute-python/ to use the question mark as a placeholder, is this unique to cursor or common to python? Where is the description for this? – David Wei Aug 25 '22 at 03:15
  • @DavidWei, i am not an expert but this is resolved somewhere close to the database, otherwise it wouldn't avoid SQL injections. Basically, when the query is parsed and a ? is encountered, then the first value of the list is used (notice that it is not the sames as simply replacing the value because the DB parser is interpreting the ? in the right context). The documentation will depend on the library/database, e.g., https://docs.python.org/3/library/sqlite3.html#sqlite3-placeholders or https://dev.mysql.com/doc/connector-python/en/connector-python-api-mysqlcursorprepared.html – toto_tico Aug 26 '22 at 06:06
  • I encountered a bug when using it, the code is as follows: placeholder='?' region_id = '930' msisdn = ['188'] in_cond = ','.join('?' for _ in msisdn) query= f"DELETE FROM cloud_financial.tb_ads_timely_user_today_4h_{region_id} WHERE msisdn IN ({in_cond})" print (query) ''' DELETE FROM cloud_financial.tb_ads_timely_user_today_4h_930 WHERE msisdn IN (?) ''' engine_vertica. execute(query, msisdn) ''' TypeError: not all arguments converted during string formatting ''' – David Wei Jan 16 '23 at 08:15
  • @sigdelsanjog Has this problem been solved? I also encountered, but did not find a solution. – David Wei Jan 16 '23 at 08:52
135

Easiest way is to turn the list to tuple first

t = tuple(l)
query = "select name from studens where id IN {}".format(t)
Amir Imani
  • 3,118
  • 2
  • 22
  • 24
  • 36
    As stated by @renstrm, this doesn't work if l just contains one element, you'll end up with id IN (1,). Which is a syntax error. – Doubledown May 30 '19 at 02:28
  • if you have only one element, then you don't need to use `IN`. you can simply do `=` – Amir Imani Jul 30 '19 at 20:10
  • 1
    @Amirhos Imani and what if you don't know how many elements you have? – Boris Nov 22 '19 at 13:35
  • 1
    @Boris if you can't guarantee that your input is always a list, you can do something similar to this one liner https://stackoverflow.com/questions/38821586/one-line-to-check-if-string-or-list-then-convert-to-list-in-python before passing the argument to your query – Amir Imani Nov 26 '19 at 15:17
  • 34
    This, like most answers here other than the accepted one, is vulnerable to SQL injection and should not be used. – Bacon Bits Dec 29 '19 at 21:22
  • 3
    @BaconBits Fair warning, but for some applications where SQL injection is not an issue (for example, analysis of a database where I am the only user), this will do. – irene Feb 07 '20 at 08:13
  • 1
    yeah this doesnt work for one element, just realized – Sibish Apr 16 '20 at 23:50
  • 1
    In case of single value check length of list should be one then convert it to like this `'({0})'.format(str(list[0]))` – Anup Yadav Jan 05 '22 at 09:26
  • yeah, incase of single value it adds empty one, so i used this. `query = "DELETE FROM some_table WHERE id IN {ids}".format( ids='(' + str(your_arr[0]) + ')' if len(your_arr) == 1 else tuple(your_arr))` – iCurious Nov 17 '22 at 12:10
36

Dont complicate it, Solution for this is simple.

l = [1,5,8]

l = tuple(l)

params = {'l': l}

cursor.execute('SELECT * FROM table where id in %(l)s',params)

enter image description here

I hope this helped !!!

ALLSYED
  • 1,523
  • 17
  • 15
  • 27
    This doesn't work if `l` just contains one element, you'll end up with `id IN (1,)`. Which is a syntax error. – Peter Lithammer Nov 30 '16 at 09:12
  • If it is only one element , IN is not the operator you are looking for.OP clearly doesn;t. – ALLSYED Dec 23 '16 at 04:48
  • 5
    If l only contains 1 element be sure it's a tuple and it WILL work. This solution is clearer in my opinion than the accepted one. – Alessandro Dentella Jan 11 '17 at 09:20
  • 2
    But it still won't work if tuple is empty, so there must be additional check before query execution. – Никита Конин Apr 17 '17 at 12:22
  • Some database connectors (like pyodbc connecting to SQL Server) don't support named parameters, so this won't always work. – LexH Oct 23 '18 at 21:30
  • 2
    This doesn't work for me with `sqlite3`. What library did you test this against? – Nick Chammas Dec 16 '18 at 20:38
  • This is the best example as it has a simple string creation solution and also prevents sql injection. – DaveRGP Apr 23 '19 at 10:08
  • 1
    Good solution but @renstrm and Никита-Конин are right about requiring additional checks for when the tuple has a single element or no elements. – Anish Sana Jun 26 '19 at 17:19
  • 1
    This is the correct answer. I use redshift and pandas, there it's: `pd.read_sql_query(sql='select * from table where id in :ids_tuple', con=..., params={'ids_tuple':l});` Very convenient. and you can substitute the lists of ids in a cycle without recompilation of the query. – Alex Fedotov Jul 13 '20 at 21:13
32

The SQL you want is

select name from studens where id in (1, 5, 8)

If you want to construct this from the python you could use

l = [1, 5, 8]
sql_query = 'select name from studens where id in (' + ','.join(map(str, l)) + ')'

The map function will transform the list into a list of strings that can be glued together by commas using the str.join method.

Alternatively:

l = [1, 5, 8]
sql_query = 'select name from studens where id in (' + ','.join((str(n) for n in l)) + ')'

if you prefer generator expressions to the map function.

UPDATE: S. Lott mentions in the comments that the Python SQLite bindings don't support sequences. In that case, you might want

select name from studens where id = 1 or id = 5 or id = 8

Generated by

sql_query = 'select name from studens where ' + ' or '.join(('id = ' + str(n) for n in l))
Community
  • 1
  • 1
Blair Conrad
  • 233,004
  • 25
  • 132
  • 111
10

string.join the list values separated by commas, and use the format operator to form a query string.

myquery = "select name from studens where id in (%s)" % ",".join(map(str,mylist))

(Thanks, blair-conrad)

Community
  • 1
  • 1
gimel
  • 83,368
  • 10
  • 76
  • 104
  • 5
    Since this approach doesn't use database parameter substitution, it exposes you to SQL injection attacks. The `%` being used here is just plain Python string formatting. – Nick Chammas Dec 16 '18 at 20:27
10

I like bobince's answer:

placeholder= '?' # For SQLite. See DBAPI paramstyle.
placeholders= ', '.join(placeholder for unused in l)
query= 'SELECT name FROM students WHERE id IN (%s)' % placeholders
cursor.execute(query, l)

But I noticed this:

placeholders= ', '.join(placeholder for unused in l)

Can be replaced with:

placeholders= ', '.join(placeholder*len(l))

I find this more direct if less clever and less general. Here l is required to have a length (i.e. refer to an object that defines a __len__ method), which shouldn't be a problem. But placeholder must also be a single character. To support a multi-character placeholder use:

placeholders= ', '.join([placeholder]*len(l))
jimhark
  • 4,938
  • 2
  • 27
  • 28
5

If you're using PostgreSQL with the Psycopg2 library you can let its tuple adaption do all the escaping and string interpolation for you, e.g:

ids = [1,2,3]
cur.execute(
  "SELECT * FROM foo WHERE id IN %s",
  [tuple(ids)])

i.e. just make sure that you're passing the IN parameter as a tuple. if it's a list you can use the = ANY array syntax:

cur.execute(
  "SELECT * FROM foo WHERE id = ANY (%s)",
  [list(ids)])

note that these both will get turned into the same query plan so you should just use whichever is easier. e.g. if your list comes in a tuple use the former, if they're stored in a list use the latter.

Sam Mason
  • 15,216
  • 1
  • 41
  • 60
5

Just use inline if operation with tuple function:

query = "Select * from hr_employee WHERE id in " % tuple(employee_ids) if len(employee_ids) != 1 else "("+ str(employee_ids[0]) + ")"
Greed Ruler
  • 169
  • 2
  • 14
  • My problem was with parameterised queries. The injected list param, always ended up having its values surrounded by quotes in the resulting SQL `IN` clauses which was confusing as hell. The `tuple()` idea worked. It works because DJango translates that into a string format suited to an SQL `IN()` clause. It was this that didn't click for me when several answers suggested using the `tuple`. – theruss Jan 19 '21 at 23:16
4

Solution for @umounted answer, because that broke with a one-element tuple, since (1,) is not valid SQL.:

>>> random_ids = [1234,123,54,56,57,58,78,91]
>>> cursor.execute("create table test (id)")
>>> for item in random_ids:
    cursor.execute("insert into test values (%d)" % item)
>>> sublist = [56,57,58]
>>> cursor.execute("select id from test where id in %s" % str(tuple(sublist)).replace(',)',')'))
>>> a = cursor.fetchall()
>>> a
[(56,), (57,), (58,)]

Other solution for sql string:

cursor.execute("select id from test where id in (%s)" % ('"'+'", "'.join(l)+'"'))
Ximix
  • 421
  • 3
  • 10
4

To run a select from where field is in list of strings (instead of int), as per this question use repr(tuple(map(str, l))). Full example:

l = ['a','b','c']
sql = f'''

select name 
from students 
where id in {repr(tuple(map(str, l)))}
'''
print(sql)

Returns: select name from students where id in ('a', 'b', 'c')

For a list of dates in Oracle, this worked

l = ['2020-11-24', '2020-12-28']
dates_str = ','.join([f'DATE {repr(s)}' for s in l])
dates_str = f'({dates_str})'

sql_cmd = f'''
select *
from students 
where 
and date in {dates_str}
'''

Returns: select * from students where and date in (DATE '2020-11-24',DATE '2020-12-28')

If you need to get the list of dates from a pandas df, it's df['date'].dt.strftime('%Y-%m-%d').unique()

And since I often needed it too, adding columns from a list

# single list
f'select {','.join(l)}'

# multi list in different tables
sql_cmd = f'''
select {','.join(f't1.{s}' for s in l1)},
{','.join(f't1.{s}' for s in l2)},
{','.join(f't2.{s}' for s in l3)}  
'''
citynorman
  • 4,918
  • 3
  • 38
  • 39
3
l = [1] # or [1,2,3]

query = "SELECT * FROM table WHERE id IN :l"
params = {'l' : tuple(l)}
cursor.execute(query, params)

The :var notation seems simpler. (Python 3.7)

user13476428
  • 71
  • 1
  • 1
  • 3
2
placeholders= ', '.join("'{"+str(i)+"}'" for i in range(len(l)))
query="select name from students where id (%s)"%placeholders
query=query.format(*l)
cursor.execute(query)

This should solve your problem.

2

a simpler solution:

lst = [1,2,3,a,b,c]

query = f"""SELECT * FROM table WHERE IN {str(lst)[1:-1}"""
Omar Omeiri
  • 1,506
  • 1
  • 17
  • 33
1

For example, if you want the sql query:

select name from studens where id in (1, 5, 8)

What about:

my_list = [1, 5, 8]
cur.execute("select name from studens where id in %s" % repr(my_list).replace('[','(').replace(']',')') )
pgalilea
  • 261
  • 2
  • 5
1

This uses parameter substitution and takes care of the single value list case:

l = [1,5,8]

get_operator = lambda x: '=' if len(x) == 1 else 'IN'
get_value = lambda x: int(x[0]) if len(x) == 1 else x

query = 'SELECT * FROM table where id ' + get_operator(l) + ' %s'

cursor.execute(query, (get_value(l),))
1

This Will Work If Number of Values in List equals to 1 or greater than 1

t = str(tuple(l))
if t[-2] == ',':
   t= t.replace(t[-2],"")
query = "select name from studens where id IN {}".format(t)
raj
  • 186
  • 1
  • 9