I'm running a long-ish INSERT query in Hive using PyHive 0.6.1, and it fails with thrift.transport.TTransport.TTransportException: TSocket read 0 bytes after about 5 minutes of running.
On the server side, the query keeps running until it finishes successfully. I don't have this problem with fast queries.
I can't reproduce it locally on my Mac with the same Python version: there, the code correctly waits until the query finishes. The environment in which this happens is a Docker container based on python:3.6-slim. Among other things, I'm installing the libsasl2-dev and libsasl2-modules packages, and the pyhive[hive] Python package.
Any clue why this is happening? Thanks in advance.
The code I'm using is:
import contextlib
from pyhive.hive import connect

def get_conn():
    return connect(
        host='my-host',
        port=10000,
        auth='NONE',
        username='username',
        database='database'
    )

with contextlib.closing(get_conn()) as conn, \
        contextlib.closing(conn.cursor()) as cur:
    cur.execute('My long insert statement')
This is the full traceback:
Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "/usr/local/lib/python3.6/site-packages/pyhive/hive.py", line 364, in execute
    response = self._connection.client.ExecuteStatement(req)
  File "/usr/local/lib/python3.6/site-packages/TCLIService/TCLIService.py", line 280, in ExecuteStatement
    return self.recv_ExecuteStatement()
  File "/usr/local/lib/python3.6/site-packages/TCLIService/TCLIService.py", line 292, in recv_ExecuteStatement
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "/usr/local/lib/python3.6/site-packages/thrift/protocol/TBinaryProtocol.py", line 134, in readMessageBegin
    sz = self.readI32()
  File "/usr/local/lib/python3.6/site-packages/thrift/protocol/TBinaryProtocol.py", line 217, in readI32
    buff = self.trans.readAll(4)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/usr/local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 166, in read
    self._read_frame()
  File "/usr/local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 170, in _read_frame
    header = self._trans.readAll(4)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TSocket.py", line 132, in read
    message='TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "/usr/local/lib/python3.6/contextlib.py", line 185, in __exit__
    self.thing.close()
  File "/usr/local/lib/python3.6/site-packages/pyhive/hive.py", line 221, in close
    response = self._client.CloseSession(req)
  File "/usr/local/lib/python3.6/site-packages/TCLIService/TCLIService.py", line 218, in CloseSession
    return self.recv_CloseSession()
  File "/usr/local/lib/python3.6/site-packages/TCLIService/TCLIService.py", line 230, in recv_CloseSession
    (fname, mtype, rseqid) = iprot.readMessageBegin()
  File "/usr/local/lib/python3.6/site-packages/thrift/protocol/TBinaryProtocol.py", line 134, in readMessageBegin
    sz = self.readI32()
  File "/usr/local/lib/python3.6/site-packages/thrift/protocol/TBinaryProtocol.py", line 217, in readI32
    buff = self.trans.readAll(4)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/usr/local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 166, in read
    self._read_frame()
  File "/usr/local/lib/python3.6/site-packages/thrift_sasl/__init__.py", line 170, in _read_frame
    header = self._trans.readAll(4)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TTransport.py", line 60, in readAll
    chunk = self.read(sz - have)
  File "/usr/local/lib/python3.6/site-packages/thrift/transport/TSocket.py", line 132, in read
    message='TSocket read 0 bytes')
thrift.transport.TTransport.TTransportException: TSocket read 0 bytes
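
The first traceback shows the client blocked inside ExecuteStatement, waiting on a single socket read for the whole duration of the query. One variation that might be worth testing, assuming (unconfirmed) that something between the container and HiveServer2 drops connections that stay idle for several minutes, is to submit the statement asynchronously and poll for its status, so that no single read stays open that long. Below is a minimal sketch of such a polling loop. The helper name wait_for_operation and its parameters are hypothetical; cursor.poll() and its operationState field, execute(..., async_=True), and TOperationState are used as I understand them from the PyHive README and should be checked against the installed version.

```python
import time


def wait_for_operation(cursor, running_states, poll_interval=5.0, sleep=time.sleep):
    """Poll a PyHive-style cursor until its operation leaves `running_states`.

    `cursor` only needs a poll() method whose result has an `operationState`
    attribute. Returns the last state reported by the server.
    """
    state = cursor.poll().operationState
    while state in running_states:
        # Each poll is a short request/response, so the connection
        # is never silent for longer than `poll_interval` seconds.
        sleep(poll_interval)
        state = cursor.poll().operationState
    return state


# Hypothetical usage against a real HiveServer2 (not executed here):
#
#   from TCLIService.ttypes import TOperationState
#
#   cur.execute('My long insert statement', async_=True)
#   final = wait_for_operation(
#       cur,
#       running_states=(
#           TOperationState.INITIALIZED_STATE,
#           TOperationState.PENDING_STATE,
#           TOperationState.RUNNING_STATE,
#       ),
#   )
#   if final != TOperationState.FINISHED_STATE:
#       raise RuntimeError('query ended in state %s' % final)
```

If the error disappears with this approach, that would point towards an idle-connection timeout somewhere on the network path rather than a problem in PyHive itself. If it still occurs, that assumption is probably wrong.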