3

I am using Python and PyMySQL. I want to fetch a number of items from a MySQL database according to their ids:

items_ids = tuple([3, 2])
sql = f"SELECT * FROM items WHERE item_id IN {items_ids};"

I am using the formatted string literals (f" ", https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pep498) to evaluate the tuple inside the SQL statement.

However,I want to get back the items in the order specified by the tuple so firstly the item with item_id = 3 and then the item with item_id = 2. To accomplish this I have to use the ORDER BY FIELD clause (see also here: Ordering by the order of values in a SQL IN() clause).

But if I write something like this:

items_ids = tuple([3, 2])
sql = f"SELECT * FROM items WHERE item_id IN {items_ids} ORDER BY FIELD{(item_id,) + items_ids};"

then item_id in the ORDER BY FIELD clause is considered as an undeclared python variable

and if I write something like this:

items_ids = tuple([3, 2])
sql = f"SELECT * FROM items WHERE item_id IN {items_ids} ORDER BY FIELD{('item_id',) + items_ids};"

then item_id in the ORDER BY FIELD clause is considered as a string and not as a SQL variable and in this case ORDER BY FIELD does not do anything.

How can I evaluate the tuple (item_id,) + items_ids in the SQL statement by maintaining item_id as a SQL variable in the ORDER BY FIELD clause?

Obviously I can sort the items after they have returned from the database according to items_ids and without bothering so much with MySQL but I was just wondering how to do this.

Outcast
  • 4,967
  • 5
  • 44
  • 99
  • Can you try this sql = f"SELECT * FROM items WHERE item_id IN {items_ids} ORDER BY FIELD(item_id, {items_ids});" if you need item_ids in order you can use sql = f"SELECT * FROM items WHERE item_id IN {items_ids} ORDER BY FIELD(item_id, {sorted(items_ids)});" – Nishant Patel May 31 '18 at 13:23
  • Thank you for your comment. `items_ids = tuple([3, 2])` is already a tuple so `(item_id, {items_ids})` throws an error for the SQL query. – Outcast May 31 '18 at 14:04

3 Answers3

11

Please don't use f-strings, or any string formatting, for passing values to SQL queries. That's the road to SQL injection. Now you may be thinking: "it's a tuple of integers, what bad could happen?" First of all a single element Python tuple's string representation is not valid SQL. Secondly, someone may follow the example with user controllable data other than tuples of ints (so having these bad examples online perpetuates the habit). Also the reason why you have to resort to your "cunning" solution is using the wrong tools for the job.

The correct way to pass values to SQL queries is to use placeholders. In case of pymysql the placeholder is – a bit confusingly – %s. Don't mix it with manual %-formatting. In case of having to pass a variable amount of values to a query you do have to resort to some string building, but you build the placeholders, not the values:

item_ids = (3, 2)
item_placeholders = ', '.join(['%s'] * len(item_ids))

sql = f"""SELECT * FROM items
          WHERE item_id IN ({item_placeholders})
          ORDER BY FIELD(item_id, {item_placeholders})"""

# Produces:
#
#     SELECT * FROM items
#     WHERE item_id IN (%s, %s)
#     ORDER BY FIELD(item_id, %s, %s)

with conn.cursor() as cur:
    # Build the argument tuple
    cur.execute(sql, (*item_ids, *item_ids))
    res = cur.fetchall()
Ilja Everilä
  • 50,538
  • 7
  • 126
  • 127
  • Thank you for your answer(upvote). I posted also an answer with using `.format()` and `f-strings`. – Outcast Jun 01 '18 at 10:12
2

Another simpler way to resolve this single element tuple problem is by checking the length of the element by keeping it into list and keeping it as a list rather than passing it as a tuple to cursor param:

eg:

 if (len(get_version_list[1])==1):
                port_id=str(port_id[0])
                port_id = '(' + "'" + port_id + "'" + ')'

            else:
                port_id=tuple(port_id)

pd.read_sql(sql=get_version_str.format(port_id,src_cd), con=conn)

By using above code simply you won't get (item_id,) this error in sql further:)

Aurora
  • 81
  • 6
-3

A solution with .format() is the following:

items_ids = tuple([3, 2])
items_placeholders = ', '.join(['{}'] * len(items_ids))

sql = "SELECT * FROM items WHERE item_id IN {} ORDER BY FIELD(item_id, {});".format(items_ids, items_placeholders).format(*items_ids)

# with `.format(items_ids, items_placeholders)` you get this: SELECT * FROM items WHERE item_id IN (3, 2) ORDER BY FIELD(item_id, {}, {});
# and then with `.format(*items_ids)` you get this: SELECT * FROM items WHERE item_id IN (3, 2) ORDER BY FIELD(item_id, 3, 2);

A rather tricky solution with f-strings is the following:

sql1 = f"SELECT * FROM items WHERE item_id IN {item_ids} ORDER BY FIELD(item_id, "
sql2 = f"{items_ids};"
sql = sql1 + sql2[1:]

# SELECT * FROM items WHERE item_id IN (3, 2) ORDER BY FIELD(item_id, 3, 2);

But as @IIija mentions, I may get a SQL injection with it because IN {item_ids} cannot accommodate one-element tuples as such.

Additionally, using f-strings to unpack tuples in strings is perhaps more difficult than using .format() as others have mentioned before (Formatted string literals in Python 3.6 with tuples) since you cannot use * to unpack a tuple within a f-string. However, perhaps you may come up with a solution for this (which is using a iterator?) to produce this

sql = f"SELECT * FROM items WHERE item_id IN ({t[0]}, {t[1]}) ORDER BY FIELD(item_id, {t[0]}, {t[1]});"

even though I do not have the solution for this in my mind right now. You are welcome to post a solution of this kind if you have it in your mind.

Outcast
  • 4,967
  • 5
  • 44
  • 99
  • The 1-tuple case fails in `f"... IN {item_ids} ..."`, because the string representation of a Python tuple includes the comma. The point about SQL injection was that your particular case is possibly "safe", but that depends on the source of `items_ids` and it not allowing anything other than tuples of integers (not for example mixed tuples of ints and strings, from the user). On the other hand when using the driver's placeholders you simply don't have to worry about such things. Depending on your driver you get nice data conversions for free as well. – Ilja Everilä Jun 01 '18 at 10:24
  • By the way, if you can instantly see how you can built `sql = f"SELECT * FROM items WHERE item_id IN ({t[0]}, {t[1]}) ORDER BY FIELD(item_id, {t[0]}, {t[1]});"` in order to actually use the `f-strings` without risking any SQL injection then let me know please. Even though I suspect that this solution will rather more complex than yours and mine with `.format()` – Outcast Jun 01 '18 at 10:36
  • The way to build safe SQL query strings using f-strings would be to quote the passed in values manually: `f"SELECT * FROM items WHERE item_id IN ({quote(t[0])}, {quote(t[1])})"`, but that gets annoying real fast and it is really easy to forget or get wrong (see: PHP). Or you could argue, that you're only using `int` values, so it's ok to just format them in... until they aren't. The way to avoid the risk of SQL injection is to use parameterized queries and let the DB-API driver handle the values. – Ilja Everilä Jun 01 '18 at 11:27
  • Yes, ok manually is obvious. But then obviously you cannot handle tuples of different lengths (without modifying again things manually). – Outcast Jun 01 '18 at 11:35