1

guys! Hope someone can help me with this issue.

I executing a query through SQLAlchemy that returns ~6kk rows (it's historical data) that I need to process on a python script. I have some functions to read and do some processing on the data using pandas dataframe.Here are the functions:

def consulta_db_cancelamentos(db_con, query):
    engine = create_engine(db_con, pool_recycle=3600)
    con = engine.connect()
    query_result = con.execution_options(stream_results=True).execute(query)
    query_result_list = []
    while True:
        rows = query_result.fetchmany(10000)
        if not rows:
            break
        for row in rows:
            data = row['data'],\
                row['plano'],\
                row['usuario_id'],\
                row['timestamp_cancelamentos'],\
                row['timestamp'],\
                row['status']

            query_result_list.append(data)

        df = pd.DataFrame()
        if df.empty:
            df = pd.DataFrame(query_result_list)
        else:
            df.append(pd.DataFrame(query_result_list))
        df_cor = corrige_cancelamentos_df(df, '2017-01-01', '2017-12-15')

    con.close()

    return df_cor

As you can see, I'm already trying to read the data and process/store it in 10k rows chunk. When I try to execute the whole script I got this error on the function (I'm also including the error raised on the main()):

Traceback (most recent call last):
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1159, in fetchmany
    l = self.process_rows(self._fetchmany_impl(size))
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1318, in _fetchmany_impl
    row = self._fetchone_impl()
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1308, in _fetchone_impl
    self.__buffer_rows()
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1295, in __buffer_rows
    self.__rowbuffer = collections.deque(self.cursor.fetchmany(size))
  File "/usr/local/lib/python3.5/dist-packages/pymysql/cursors.py", line 485, in fetchmany
    row = self.read_next()
  File "/usr/local/lib/python3.5/dist-packages/pymysql/cursors.py", line 446, in read_next
    return self._conv_row(self._result._read_rowdata_packet_unbuffered())
  File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1430, in _read_rowdata_packet_unbuffered
    packet = self.connection._read_packet()
  File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1008, in _read_packet
    recv_data = self._read_bytes(bytes_to_read)
  File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1037, in _read_bytes
    CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/aiquis/EI/cancelamentos_testes5.py", line 180, in <module>
    main()
  File "/home/aiquis/EI/cancelamentos_testes5.py", line 164, in main
    cancelamentos_df_corrigido = consulta_db_cancelamentos(db_param, query_cancelamentos)
  File "/home/aiquis/EI/cancelamentos_testes5.py", line 14, in consulta_db_cancelamentos
    rows = query_result.fetchmany(1000)
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1166, in fetchmany
    self.cursor, self.context)
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1413, in _handle_dbapi_exception
    exc_info
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 186, in reraise
    raise value.with_traceback(tb)
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1159, in fetchmany
    l = self.process_rows(self._fetchmany_impl(size))
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1318, in _fetchmany_impl
    row = self._fetchone_impl()
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1308, in _fetchone_impl
    self.__buffer_rows()
  File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1295, in __buffer_rows
    self.__rowbuffer = collections.deque(self.cursor.fetchmany(size))
  File "/usr/local/lib/python3.5/dist-packages/pymysql/cursors.py", line 485, in fetchmany
    row = self.read_next()
  File "/usr/local/lib/python3.5/dist-packages/pymysql/cursors.py", line 446, in read_next
    return self._conv_row(self._result._read_rowdata_packet_unbuffered())
  File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1430, in _read_rowdata_packet_unbuffered
    packet = self.connection._read_packet()
  File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1008, in _read_packet
    recv_data = self._read_bytes(bytes_to_read)
  File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1037, in _read_bytes
    CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') (Background on this error at: http://sqlalche.me/e/e3q8)
Exception ignored in: <bound method MySQLResult.__del__ of <pymysql.connections.MySQLResult object at 0x7f8c543dc198>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1345, in __del__
  File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1447, in _finish_unbuffered_query
  File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 991, in _read_packet
  File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1022, in _read_bytes
AttributeError: 'NoneType' object has no attribute 'settimeout'
[Finished in 602.4s]

The way I wrote consulta_db_cancelamentos is already a result of some search on SO and SQLAlchemy documentation. Suppose I have no access to my MySQL Server administration.

When I limit my queryy to bring results for only one usuario_id for example (something like ~50 rows) it works fine. I executed the same query on MySQL Workbench and the Duration/Fetch was 251.998 sec/357.541 sec

aiquis
  • 33
  • 1
  • 8

2 Answers2

1

Solved executing this command in MySQL Server:

set global max_allowed_packet = 67108864;

This solution was suggested here Lost connection to MySQL server during query

aiquis
  • 33
  • 1
  • 8
1

I faced a problem like this one. I faced problem after I put stream_results=True to the query.

For me, the parameter net_write_timeout was causing a trouble. When stream_results=True and the app does not immediately read all the data being sent from the MySQL server, then on the MySQL server side, the write communication (sending data packet) to the app gets blocked. net_write_timeout seems to be the parameter which controls how many seconds the connection is allowed for the MySQL server's write to the socket operation being blocked.

So, I've changed net_write_timeout parameter from 60 (default) to 3600 and solved the problem.

Yuki Inoue
  • 3,569
  • 5
  • 34
  • 53