1

I have built an API in Flask that performs classification on text messages with Keras. I am currently using sshtunnel and MySQLdb to connect to a MySQL database to fetch messages from a remote database. The entire application is wrapped in a Docker container.

I am able to establish a connection to the remote database and successfully query it, but I am opening and closing a new ssh tunnel every time a POST request comes into the API, and this slows down performance.

I have tried to open a single ssh tunnel and database connection "to rule them all", but the connection gets stale if there is no activity after an hour or so, and then API requests take forever and a day to complete.

How have you done this? Is this slowness unavoidable or is there a way to periodically refresh the ssh and database connections?

This is how I am connecting to my database for every incoming request:

with SSHTunnelForwarder(
          (host, 22),
          ssh_username=ssh_username,
          ssh_private_key=ssh_private_key,
          remote_bind_address=(localhost, 3306)
     ) as server:
          conn = db.connect(host=localhost,
          port=server.local_bind_port,
          user=user,
          passwd=password,
          db=database)
Gabriel
  • 219
  • 2
  • 13
  • 1
    If you really have to do this, why not open the tunnel outside of the application `ssh -L 3306:localhost:3306 user@remote` and then point your application at `localhost:3306`? If you really care about performance though, you should put your database and your application on the same network if possible. – Suever Oct 06 '17 at 13:38

1 Answers1

2

Okay, I figured it out. I created a DB object as suggested in this answer but with a slight modification. I kept track of the time that the connection to the database was created and then re-established the connection every 30 minutes. This means that one or two queries take slightly longer because I am rebuilding the connection to the database, but the rest of them run much faster and the connection won't go stale.

I've included some code below. I realize the code isn't perfect, but it's what has worked for me so far.

import MySQLdb as mydb
import time
import pandas as pd
from sshtunnel import SSHTunnelForwarder

class DB:

    def __init__(self):
        self.open_ssh_tunnel()

        self.conn = None

        self.server = None

        self.connect()

        self.last_connected_time = time.time()


    def open_ssh_tunnel(self):
        connection_success = False

        while not connection_success:
            try:
                self.server = SSHTunnelForwarder(
                        (host, 22),
                        ssh_username=ssh_username,
                        ssh_private_key=ssh_private_key,
                        ssh_password=ssh_pwd,
                        remote_bind_address=(localhost, 3306))
                connection_success = True
            except:
                time.sleep(0.5)

        self.server.start()


    def connect(self):
        connection_success = False

        while not connection_success:
            try:
                self.conn = mydb.connect(host=localhost,
                        port=server.local_bind_port,
                        user=user,
                        passwd=password,
                        db=database)
                connection_success = True
            except:
                time.sleep(0.5)


    def query(self, sql):

        result = None
        current_time = time.time()

        if current_time - self.last_connected_time > 1600:
            self.last_connected_time = current_time
            self.server.close()
            self.conn.close()
            self.open_ssh_tunnel()
            self.connect()
        try:
            result = pd.read_sql_query(sql, self.conn).values
            self.conn.commit()
        except:
            self.server.close()
            self.conn.close()
            self.open_ssh_tunnel()
            self.connect()
            result = pd.read_sql_query(sql, self.conn).values

        return result  
Gabriel
  • 219
  • 2
  • 13