23

I am using SQLAlchemy to connect to different databases in Python, but not with the ORM support as this cannot be implemented due to several reasons.

Mainly I do build a complex SQL query using things like

sql += "AND fieldname = '%s'" % myvar

In my case is not a problem of SQL injection as the data is always from a trusted source but even if the source is trusted it could contain characters that could break the query like ', % or _.

Mainly, I need to escape them, and I wonder if there is an already existing escape function that I could re-use.

sorin
  • 161,544
  • 178
  • 535
  • 806

6 Answers6

19

You should not try to implement your own escaping, but should instead use SQLAlchemy's builtin method:

sql = 'select * from foo where fieldname = :name'
result = connection.execute(sql, name = myvar)
BrtH
  • 2,610
  • 16
  • 27
  • 10
    @Ronny Firstly, his question was how to escape characters in an sql-query. This is the correct answer to that. Secondly, do you really think your comment is useful on a post from a year ago?? – BrtH Apr 26 '15 at 22:11
  • 33
    @BrtH it does not matter if it was a second or a year ago, the answer is still being read by other users. – SimonT Jul 11 '16 at 21:00
  • 6
    Well I'm still reading this today and I have no idea what you guys were talking about when you are saying the 'sql expression layer' – Majiick Nov 18 '18 at 22:47
  • 3
    Yeah, this solution doesn't allow someone to build an aggregate insert using the `INSERT INTO table_name (col1, col2) VALUES ('col1', 'col2)` format. =P For this, you really need to genuinely escape values prior to collecting them. – tpartee Apr 27 '19 at 01:19
  • 1
    @tpartee And for that as well there are better methods than manually escaping. – Ilja Everilä Oct 17 '20 at 07:26
  • I found this looking for an answer to a question "How to escape SQL strings manually" and the top answer is "don't"... It can by good advice but you could nevertheless write how to do it. –  Jan 27 '22 at 17:29
  • Is it possible to print sql with parameter values instead of executing? So that the printout would be "select * from foo where fieldname = 'myvar'"? – tok May 03 '22 at 11:32
16

In cases where one must explicitly escape a string, and the standard tools don't align with the requirement, you can ask SQLAlchemy to escape using an engine's dialect.

import sqlalchemy


engine = sqlalchemy.create_engine(...)
sqlalchemy.String('').literal_processor(dialect=engine.dialect)(value="untrusted value")

In my case, I needed to dynamically create a database (sqlalchemy-utils has this functionality but it failed in my case) according to user input.

edd
  • 1,307
  • 10
  • 10
  • 5
    Good answer but incomplete. Example doesn't actually do anything. it just puts "untrusted value" into " 'untrusted value' " with quotes. Pretty sure you need to clarify with a proper example not to mention doesn't explain how you would write a 'SELECT * FROM :table_dynamic" type thing. – Dexter Jan 25 '21 at 21:39
4

To extend @edd 's answer, which works in a limited capacity.

@edd provided:

import sqlalchemy

engine = sqlalchemy.create_engine(...)
sqlalchemy.String('').literal_processor(dialect=engine.dialect)(value="untrusted value")

If your "untrusted value" is a query you want to execute, this will end up a double-quoted string wrapping a single-quoted string, which you can't directly execute without stripping the quotes, i.e. "'SELECT ...'".

You can used sqlalchemy.Integer().literal_processor to do the same thing, but the result will not have the extra inner quotes, because it is intended to create an integer like 5 instead of a string like '5'. So your result will only be quoted once: "SELECT ...".

I found this Integer approach a little sketchy - is the person that reads my code going to know why I'm doing this? For psycopg2 at least, there is a more direct and clear approach.

If your underlying driver is psycopg2, you can use sqlalchemy to reach down into the driver, get the cursor, then use psycopg2's cursor.mogrify to bind & escape your query

from sqlalchemy.orm import sessionmaker

Session = sessionmaker(bind=engine)
session = Session()
cursor = session.connection().connection.cursor()
processed_query = cursor.mogrify([mogrify args, see docs]).decode("UTF-8")

I got how to grab the cursor from this answer: SQLAlchemy, Psycopg2 and Postgresql COPY

And mogrify from this answer: psycopg2 equivalent of mysqldb.escape_string?

My use case was building a query, then wrapping it in parantheses and aliasing like (SELECT ...) AS temp_some_table, in order to pass it to PySpark JDBC read. When SQLAlchemy builds the queries, it minimizes the parentheses, and so I could only get SELECT ... AS temp_some_table. I used the above approch to get what I need:

cursor = session.connection().connection.cursor()
aliased_query = cursor.mogrify(
    f"({query}) AS temp_{model.__tablename__}"
).decode("UTF-8")
francojposa
  • 185
  • 1
  • 2
  • 14
4

there also exists escape_like

from sqlalchemy_utils import escape_like


User.name.ilike(escape_like('John'))

So you can call escape_like for all your strings before formatting them to sql

Ryabchenko Alexander
  • 10,057
  • 7
  • 56
  • 88
1

You can use escape_string method from pymysql and then escape : so SQLAlchemy won't try to bind parameter for that variable, here is the sample

import MySQLdb
query = """ insert into.... values("{}"...) """.format(MySQLdb.escape_string(item).replace(':','\:'))

Note that if you use this way your code vulnerable to SQL Injection

to install pymysql

pip3 install pymysql
mirhossein
  • 682
  • 7
  • 16
  • 1
    Note that this may not be sufficient for all SQL dialects. For example, `escape_string` uses a backslash to escape quotes which doesn't work with SQLite. – totalhack Sep 11 '20 at 12:57
1

Compiling and completing other contributors' answers.

Writing direct SQL strings is often a bad solution, because each database system supports its own SQL dialect, and hence SQL strings are often not portable across databases.

To abstract this issue away from the users, SQLAlchemy invites you to write your SQL queries at a slightly higher level, in a more object-oriented fashion. It's called the SQL Expression language, and it's documented here: https://docs.sqlalchemy.org/en/13/core/tutorial.html

Basically, you can build expressions in Python that describe what you want to do, and SQLAlchemy will produce the corresponding SQL string for you, using the appropriate dialect for the database you're using.

Since you are familiar with SQL, you can learn this "mini-language" in a matter of hour(s) (I'm hesitant to put an "s" here).

And if I believe @BrtH, using this system will also escape strings for you, transparently. Escaping is hard to get right, and so it's always better to leave it to a mature system than trying to do it yourself.

Here's a random example of a select clause:

from sqlalchemy import select
...
ham_table = meta.tables['ham']
sel = select([ham_table.c.size, ham_table.c.weight]).where(ham_table.c.taste == 'yummy')
result = meta.bind.execute(sel)

Don't be thrown off by the ".c.", it's just a convention that helps them automate things nicely for you. They basically generate column descriptors for you, and store them under the .c field of your table object.

Ben017
  • 64
  • 5