1

I am working on a function which requires to query several database tables (these are all SELECT statements), for ORM library and database driver I use SQLAlchemy and PyMySQL.

Initially I built the function entirely at ORM level, however I hit performance issue due to several round trips between my python application server and database server every time when running the function , since then I came up with a new approach to send multiple SQL statements in one flight to database server at SQLAlchemy Core level.

Here is simplified version of my new approach :

from sqlalchemy.sql import select as sa_select
db_engine = sqlalchemy_engin_init()

def my_db_query_fn(self, user_input_1, user_input_2):
    # note that user_input_1, user_input_2 are from untrusted user input
    with db_engine.connect() as conn:
        # raw DBAPI connection for multiple result sets in one go
        dbapi_conn = conn.connection
        # cursor comes from low-level database driver, not standard SQLAlchemy class
        cursor = dbapi_conn.cursor()
        try:
            table_a = Model_A.__table__
            table_b = Model_B.__table__
            stmt  = sa_select(table_a.c.column1, table_a.c.column2
                        ).where(table_a.c.column3 == user_input_1)
            stmt2 = sa_select(table_b.c.column1, table_b.c.column2
                        ).where(table_b.c.column3 == user_input_2)
            sql_compile_fn = lambda s: str(s.compile(conn.engine, compile_kwargs={"literal_binds": True})) 
            # generate raw SQL statements by running compile()
            rawsqls = list(map(sql_compile_fn, [stmt, stmt2]))
            # send all the generated raw SQL statements in one round trip
            cursor.execute(';'.join(rawsqls))
            # then retrieve each result set one after the other, this part is skipped ....
        finally:
            cursor.close()

After testing , the code above does reduce the round trip latency, but my question is : is my new approach vulnerable to SQL injection attack. Here are what I've been considering :

  • As commented in the code sample above, the function my_db_query_fn() necessarily takes untrusted user inputs user_input_1 and user_input_2 to build all SQL statements, even all the raw SQL strings are generated by sqlalchemy.sql.selectable.Select.compile() instead of directly formatting raw SQL string patterns with the untrusted user inputs.
  • the function my_db_query_fn() executes SQL statements by cursor.execute() not conn.execute(). The cursor comes from low-level database driver (in my case it is PyMySQL) , I don't know if that is considered as bad practice to avoid SQL injection. (Or cursor.execute() and conn.execute() are never related to that ?)
  • According to the similar question on stackoverflow and official documentation , the function YOUR_MODEL.__table__.select() in SQLAlchemy ORM and the function sqlalchemy.sql.expression.select(YOUR_MODEL) in SQLAlchemy Core actually generate instances of the same class type sqlalchemy.sql.selectable.Select , and the SELECT statement instance can be passed to either conn.execute() or session.execute() . does that implicitly mean SQLAlchemy Core will also check against SQL injection ?

thanks for reading, I appreciate your feedback, hints, or tips

Ham
  • 703
  • 8
  • 17
  • [This warning](https://docs.sqlalchemy.org/en/14/faq/sqlexpressions.html#rendering-bound-parameters-inline) seems very clear that this technique is potentially vulnerable to SQL injection. – snakecharmerb Jan 16 '22 at 12:26

0 Answers0