guys! Hope someone can help me with this issue.
I executing a query through SQLAlchemy that returns ~6kk rows (it's historical data) that I need to process on a python script. I have some functions to read and do some processing on the data using pandas dataframe.Here are the functions:
def consulta_db_cancelamentos(db_con, query):
engine = create_engine(db_con, pool_recycle=3600)
con = engine.connect()
query_result = con.execution_options(stream_results=True).execute(query)
query_result_list = []
while True:
rows = query_result.fetchmany(10000)
if not rows:
break
for row in rows:
data = row['data'],\
row['plano'],\
row['usuario_id'],\
row['timestamp_cancelamentos'],\
row['timestamp'],\
row['status']
query_result_list.append(data)
df = pd.DataFrame()
if df.empty:
df = pd.DataFrame(query_result_list)
else:
df.append(pd.DataFrame(query_result_list))
df_cor = corrige_cancelamentos_df(df, '2017-01-01', '2017-12-15')
con.close()
return df_cor
As you can see, I'm already trying to read the data and process/store it in 10k rows chunk. When I try to execute the whole script I got this error on the function (I'm also including the error raised on the main()
):
Traceback (most recent call last):
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1159, in fetchmany
l = self.process_rows(self._fetchmany_impl(size))
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1318, in _fetchmany_impl
row = self._fetchone_impl()
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1308, in _fetchone_impl
self.__buffer_rows()
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1295, in __buffer_rows
self.__rowbuffer = collections.deque(self.cursor.fetchmany(size))
File "/usr/local/lib/python3.5/dist-packages/pymysql/cursors.py", line 485, in fetchmany
row = self.read_next()
File "/usr/local/lib/python3.5/dist-packages/pymysql/cursors.py", line 446, in read_next
return self._conv_row(self._result._read_rowdata_packet_unbuffered())
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1430, in _read_rowdata_packet_unbuffered
packet = self.connection._read_packet()
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1008, in _read_packet
recv_data = self._read_bytes(bytes_to_read)
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1037, in _read_bytes
CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/aiquis/EI/cancelamentos_testes5.py", line 180, in <module>
main()
File "/home/aiquis/EI/cancelamentos_testes5.py", line 164, in main
cancelamentos_df_corrigido = consulta_db_cancelamentos(db_param, query_cancelamentos)
File "/home/aiquis/EI/cancelamentos_testes5.py", line 14, in consulta_db_cancelamentos
rows = query_result.fetchmany(1000)
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1166, in fetchmany
self.cursor, self.context)
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/base.py", line 1413, in _handle_dbapi_exception
exc_info
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/util/compat.py", line 186, in reraise
raise value.with_traceback(tb)
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1159, in fetchmany
l = self.process_rows(self._fetchmany_impl(size))
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1318, in _fetchmany_impl
row = self._fetchone_impl()
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1308, in _fetchone_impl
self.__buffer_rows()
File "/home/aiquis/.local/lib/python3.5/site-packages/sqlalchemy/engine/result.py", line 1295, in __buffer_rows
self.__rowbuffer = collections.deque(self.cursor.fetchmany(size))
File "/usr/local/lib/python3.5/dist-packages/pymysql/cursors.py", line 485, in fetchmany
row = self.read_next()
File "/usr/local/lib/python3.5/dist-packages/pymysql/cursors.py", line 446, in read_next
return self._conv_row(self._result._read_rowdata_packet_unbuffered())
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1430, in _read_rowdata_packet_unbuffered
packet = self.connection._read_packet()
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1008, in _read_packet
recv_data = self._read_bytes(bytes_to_read)
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1037, in _read_bytes
CR.CR_SERVER_LOST, "Lost connection to MySQL server during query")
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query') (Background on this error at: http://sqlalche.me/e/e3q8)
Exception ignored in: <bound method MySQLResult.__del__ of <pymysql.connections.MySQLResult object at 0x7f8c543dc198>>
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1345, in __del__
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1447, in _finish_unbuffered_query
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 991, in _read_packet
File "/usr/local/lib/python3.5/dist-packages/pymysql/connections.py", line 1022, in _read_bytes
AttributeError: 'NoneType' object has no attribute 'settimeout'
[Finished in 602.4s]
The way I wrote consulta_db_cancelamentos
is already a result of some search on SO and SQLAlchemy documentation. Suppose I have no access to my MySQL Server administration.
When I limit my queryy to bring results for only one usuario_id
for example (something like ~50 rows) it works fine. I executed the same query on MySQL Workbench and the Duration/Fetch was 251.998 sec/357.541 sec