64

I am trying to use parameter substitution with SQLite within Python for an IN clause. Here is a complete running example that demonstrates:

import sqlite3

c = sqlite3.connect(":memory:")
c.execute('CREATE TABLE distro (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)')

for name in 'Ubuntu Fedora Puppy DSL SuSE'.split():
  c.execute('INSERT INTO distro (name) VALUES (?)', [ name ] )

desired_ids = ["1", "2", "5", "47"]
result_set = c.execute('SELECT * FROM distro WHERE id IN (%s)' % (", ".join(desired_ids)), ())
for result in result_set:
  print result

It prints out:

(1, u'Ubuntu') (2, u'Fedora') (5, u'SuSE')

As the docs state that "[y]ou shouldn’t assemble your query using Python’s string operations because doing so is insecure; it makes your program vulnerable to an SQL injection attack," I am hoping to use parameter substitution.

When I try:

result_set = c.execute('SELECT * FROM distro WHERE id IN (?)', [ (", ".join(desired_ids)) ])

I get an empty result set, and when I try:

result_set = c.execute('SELECT * FROM distro WHERE id IN (?)', [ desired_ids ] )

I get:

InterfaceError: Error binding parameter 0 - probably unsupported type.

While I hope that any answer to this simplified problem will work, I would like to point out that the actual query I want to perform is in a doubly-nested subquery. To wit:

UPDATE dir_x_user SET user_revision = user_attempted_revision 
WHERE user_id IN 
    (SELECT user_id FROM 
        (SELECT user_id, MAX(revision) FROM users WHERE obfuscated_name IN 
            ("Argl883", "Manf496", "Mook657") GROUP BY user_id
        ) 
    )
Clinton Blackmore
  • 2,427
  • 2
  • 24
  • 31
  • Thanks for all the answers. It made a lot of sense when I finally saw that I just needed a question mark for every parameter I am substituting. – Clinton Blackmore Aug 21 '09 at 14:23
  • you can use `notanorm` from pypi, it does this for you: `desired_ids = ["1", "2", "5", "47"] result_set = c.select('distro', id=desired_ids)` – Erik Aronesty Apr 25 '22 at 20:12

6 Answers6

98

You do need the right number of ?s, but that doesn't pose a sql injection risk:

>>> result_set = c.execute('SELECT * FROM distro WHERE id IN (%s)' %
                           ','.join('?'*len(desired_ids)), desired_ids)
>>> print result_set.fetchall()
[(1, u'Ubuntu'), (2, u'Fedora'), (5, u'SuSE')]
Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
  • +1 for the "best" solution to generate the placeholder-list string :-) – Ferdinand Beyer Aug 21 '09 at 06:39
  • 3
    Is there an easy way to make this use named parameters instead? Something like `:id1 :id2 :id3` etc. I'm using this in the context of a larger query with a few other named parameters. – User Dec 10 '12 at 22:08
  • 12
    I'm coming to this several years later, but I needed named parameters, too. I just did this: `query = "SELECT * FROM my_table WHERE my_param = :my_param AND id IN ({})".format(', '.join(':{}'.format(i) for i in range(len(desired_ids)))) ; params = {'my_param': 'foo'} ; params.update({str(i): id for i, id in enumerate(desired_ids)}) ; result = cursor.execute(query, params)` The sqlite3 module is perfectly happy with things like `:0`, `:1`, `:2` as string substitution parameters. (Stack overflow really murders the code formatting in comments; sorry that's so hard to read.) – geekofalltrades Jun 04 '15 at 18:40
  • 2
    Thank you for that answer, geekofalltrades. I really wish that could be a legible, top-level answer, to go with the others that can't handle named parameters. – rspeer Apr 19 '19 at 21:11
29

According to http://www.sqlite.org/limits.html (item 9), SQLite can't (by default) handle more than 999 parameters to a query, so the solutions here (generating the required list of placeholders) will fail if you have thousands of items that you're looking IN. If that's the case, you're going to need to break up the list then loop over the parts of it and join up the results yourself.

If you don't need thousands of items in your IN clause, then Alex's solution is the way to do it (and appears to be how Django does it).

cibyr
  • 743
  • 1
  • 7
  • 12
  • 5
    Apparently this limit has increased to "32766 for SQLite versions after 3.32.0," according to the same link now. – yodavid Sep 11 '20 at 08:10
14

Update: this works:

import sqlite3

c = sqlite3.connect(":memory:")
c.execute('CREATE TABLE distro (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)')

for name in 'Ubuntu Fedora Puppy DSL SuSE'.split():
  c.execute('INSERT INTO distro (name) VALUES (?)', ( name,) )

desired_ids = ["1", "2", "5", "47"]
result_set = c.execute('SELECT * FROM distro WHERE id IN (%s)' % ("?," * len(desired_ids))[:-1], desired_ids)
for result in result_set:
  print result

The issue was that you need to have one ? for each element in the input list.

The statement ("?," * len(desired_ids))[:-1] makes a repeating string of "?,", then cuts off the last comma. so that there is one question mark for each element in desired_ids.

Mark Rushakoff
  • 249,864
  • 45
  • 407
  • 398
7

I always end up doing something like this:

query = 'SELECT * FROM distro WHERE id IN (%s)' % ','.join('?' for i in desired_ids)
c.execute(query, desired_ids)

There's no injection risk because you're not putting strings from desired_ids into the query directly.

John Fouhy
  • 41,203
  • 19
  • 62
  • 77
  • The values I'll be using in the IN clause actually come from a file exported from another system. I expect that the risk of injection is miniscule, but you never know when Bobby Tables will show up. – Clinton Blackmore Aug 21 '09 at 04:03
  • The risk of injecton is 0 because the only thing you're programatically putting into your query is a bunch of question marks. All a hypothetical attacker can do is control the number of question marks -- that's not an attack vector. The actual externally-supplied data is going through the ? parameter-passing mechanism as usual. – John Fouhy Aug 21 '09 at 04:25
1

I needed to use some other named parameters so I've developed two helper functions that may be worth sharing.

def prepare_list_query(name, values):
    """Prepare SQLite query with named parameters."""
    list_query = ", ".join(":{}_{}".format(name, i) for i in range(len(values)))
    return list_query


def prepare_list_dict(name, values):
    """Prepare SQLite dict with named parameters."""
    list_dict = {"{}_{}".format(name, i): value for i, value in enumerate(values)}
    return list_dict

# Usage:

desired_ids = ["1", "2", "5", "47"]
desired_types = ["active", "inactive"]

sql = "SELECT * FROM distro WHERE id IN ({}) AND type IN ({})".format(
    prepare_list_query("desired_id", desired_ids),
    prepare_list_query("desired_type", desired_types),
)
sql_dict = {"some": "other parameters you might need"}
sql_dict.update(prepare_list_dict("desired_id", desired_ids))
sql_dict.update(prepare_list_dict("desired_type", desired_types))

# # This results in:
# sql = "SELECT * FROM distro WHERE id IN (:desired_id_0, :desired_id_1, :desired_id_2, :desired_id_3) AND type IN (:desired_type_0, :desired_type_1)

# # and
# sql_dict = {
#     "some": "other parameters you might need",
#     "desired_id_0": "1",
#     "desired_id_1": "2",
#     "desired_id_2": "5",
#     "desired_id_3": "47",
#     "desired_type_0": "active",
#     "desired_type_1": "inactive",
# }

# And now execute the query:
result = c.execute(sql, sql_dict)
ababak
  • 1,685
  • 1
  • 11
  • 23
0

I wrote a very thin layer called notanorm to address this issue.

https://pypi.org/project/notanorm/

...and then your code looks like this:

import notanorm

c = notanorm.SqliteDb(":memory:")
c.query('CREATE TABLE distro (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)')

for name in 'Ubuntu Fedora Puppy DSL SuSE'.split():
  c.insert('distro', name=name)

desired_ids = ["1", "2", "5", "47"]
result_set = c.select('distro', id=desired_ids)

for result in result_set:
  print(result)

Which is both easy to read, and also allows you to switch databases some day.

{'id': 1, 'name': 'Ubuntu'}
{'id': 2, 'name': 'Fedora'}
{'id': 5, 'name': 'SuSE'}

Also consider looking at the heavier framework: https://www.sqlalchemy.org/

Erik Aronesty
  • 11,620
  • 5
  • 64
  • 44