0

I am running a query through SQL Alchemy via Pandas using pd.read_sql_query(sql=sql, con=conn) method.

My query is valid sql and my connection can be a connection to any database (Postgres, Mysql, MS Sql, Oracle etc.)

Sometimes this query takes a lot of time (on the order of minutes) and I would like to set a timeout for this specific query to 5 seconds only.

I retrieve conn object from sqla's engine, that's initiated using sqla's create_engine method. a simplified version of that code looks like this:

engine = self.database.get_sqla_engine()  # this method has a call to create_engine
conn = engine.connect()

So far, I have tried using conn.execute('SET statement_timeout="5s"') but this works only for postgres.

I have also tried creating Timer thread like this:

 t = threading.Timer(5, conn.connection.cancel)
 t.start

which I later catch like this:

 try:
        df = pd.read_sql_query(sql=sql, con=conn)
        return [row[0] for row in df.to_records(index=False)]

 except DBAPIError, e:
        if type(e.orig) == QueryCanceledError:
            logging.info('Long running query was cancelled.')
            return []

Problem with above solution is, it only works when I am running flask app in debug mode. On production mode (using gunicorn) the timer thread doesn't call the cancel method. (My guess is because program is already waiting for that query result from sqla)

Any solutions are welcome.

EDIT: Added details as to how connection object is being created and where does sqla comes into play.

Jazib
  • 1,200
  • 1
  • 16
  • 39
  • 1
    The solution is dependent on the DB dialect being used. I don't think there is a single uniform solution. The sql sent to achieve this will need to change based on the dialect. Threading probably isn't a viable solution here, either. [This answer](https://stackoverflow.com/a/32825482/5747944) has more details about threading inside Flask and why it can break when you switch to production. – sytech Sep 25 '18 at 13:01
  • 1
    Also, query optimizers perform differently even for same SQL query depending on data, indexes, and joins. Do you really have the same database schema AND data across all those RBDMS's? Also, where are you using `sqlalchemy`? Please shower fuller code like assignment of *sql* and *conn*. – Parfait Sep 25 '18 at 13:51
  • @Parfait I have made changes to original question to incorporate these details. Also, your first question is answered by the use of sqla, it's DBMS agnostic and allows a high level API to query these dbs – Jazib Sep 26 '18 at 06:56

0 Answers0