I've been racking my brain for the past couple of days attempting to connect to a Hive server with a Python client using pyhive on Windows. I'm new to Hive (pyhive too for that matter), but am a reasonably experienced Python dev. Invariably I get the following error:
(pyhive-test) C:\dev\sandbox\pyhive-test>python test.py
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    conn = hive.Connection(host='192.168.1.196', port='10000', database='default', auth='NONE')
  File "C:\Users\harnerd\Anaconda3\envs\pyhive-test\lib\site-packages\pyhive\hive.py", line 192, in __init__
    self._transport.open()
  File "C:\Users\harnerd\Anaconda3\envs\pyhive-test\lib\site-packages\thrift_sasl\__init__.py", line 84, in open
    raise TTransportException(type=TTransportException.NOT_OPEN,
thrift.transport.TTransport.TTransportException: Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: Unable to find a callback: 2'
when executing the following script:
from pyhive import hive
conn = hive.Connection(host='192.168.1.196', port='10000', database='default', auth='NONE')
cur = conn.cursor()
cur.execute('show tables')
data = cur.fetchall()
print(data)
The HiveServer2 instance is an out-of-the-box HDP Sandbox VM from Cloudera with HiveServer2 Authentication set to 'None'.
The client is an Anaconda virtual environment on Windows 10 with Python 3.8.5 and the following packages installed by conda:
- pyhive 0.6.1
- sasl 0.2.1
- thrift 0.13.0
- thrift-sasl 0.4.2
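To double-check that the interpreter running test.py really sees these versions, something like the following could be run inside the activated environment. This is only a sketch: it assumes the distribution names match the list above and that conda recorded standard package metadata, which may not hold for every build.

```python
from importlib import metadata

# Distribution names as listed above (an assumption; conda or pip may
# register a package under a slightly different name).
PACKAGES = ("pyhive", "sasl", "thrift", "thrift-sasl")


def installed_versions(names=PACKAGES):
    """Return {distribution name: version string, or None if not found}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # No metadata for this name in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not found'}")
```

A "not found" entry does not prove the package is missing, only that no metadata was found under that exact name.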
Right now I'm merely trying to connect to Hive with the script above, but ultimately I intend to use pyhive within SQLAlchemy in a Flask application. In other words: Flask -> Flask-SQLAlchemy -> SQLAlchemy -> pyhive. In production the Flask app will be hosted by Cloudera Data Science Workbench (i.e. some flavor of Linux), but will be developed (and therefore must also run) on Windows systems.
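For context, the SQLAlchemy end of that chain would presumably look something like the sketch below. The hive:// URL scheme is the one PyHive documents for its SQLAlchemy dialect; build_hive_url and make_engine are hypothetical helper names used only for illustration, and the host, port, and database values are the ones from my test script.

```python
def build_hive_url(host, port=10000, database="default", username=None):
    """Build a SQLAlchemy URL for PyHive's hive:// dialect (hypothetical helper)."""
    user_part = f"{username}@" if username else ""
    return f"hive://{user_part}{host}:{port}/{database}"


def make_engine(url):
    """Create a SQLAlchemy engine; requires sqlalchemy and pyhive to be installed."""
    # Imported lazily so the URL helper above works without SQLAlchemy present.
    from sqlalchemy import create_engine
    return create_engine(url)


if __name__ == "__main__":
    url = build_hive_url("192.168.1.196")
    print(url)
    # In a Flask-SQLAlchemy app the same string would go into
    # app.config["SQLALCHEMY_DATABASE_URI"].
```

Since the engine would go through the same pyhive connection code, I would expect it to hit the same SASL error until the plain hive.Connection call works.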
Of course, I've looked at the many questions here, on Cloudera's site, and on GitHub about Hive connection problems. If someone put a gun to my head, I would have to say that connecting from a Windows client may be part of the problem, as that doesn't seem to be a very common thing to do.
No mechanism available
What does that error even mean? It sure would be nice if there were some documentation on how to configure and use SASL from Python. If there is, I would like to know about it.
FWIW, the line causing the error is in thrift_sasl/__init__.py:
ret, chosen_mech, initial_response = self.sasl.start(self.mechanism)
self.mechanism is 'PLAIN'; chosen_mech and initial_response are empty strings (''). ret is False, which causes the exception to be thrown.
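To take Hive and the network out of the picture, the failing step can probably be reproduced with the sasl package alone. The sketch below mirrors the calls that pyhive and thrift_sasl appear to make for the PLAIN mechanism (setAttr, init, start, getError). The function name try_sasl_plain and the username and password values are placeholders of mine, not anything taken from the libraries.

```python
def try_sasl_plain(host, username="anonymous", password="x"):
    """Attempt the client-side SASL PLAIN start; no socket is opened here.

    Returns (ret, chosen_mech, initial_response, error), or None if the
    sasl package cannot be imported in this environment.
    """
    try:
        import sasl
    except ImportError:
        return None

    client = sasl.Client()
    client.setAttr("host", host)
    # Placeholder credentials (assumption): with auth='NONE' the values
    # should not matter, but PLAIN still expects them to be set.
    client.setAttr("username", username)
    client.setAttr("password", password)
    client.init()

    # Same call that thrift_sasl makes in open().
    ret, chosen_mech, initial_response = client.start("PLAIN")
    error = None if ret else client.getError()
    return ret, chosen_mech, initial_response, error


if __name__ == "__main__":
    print(try_sasl_plain("192.168.1.196"))
```

If this also comes back with ret False and the same "no mechanism available" message, that would suggest the problem is in the local SASL installation (for example, the PLAIN mechanism not being found) rather than in HiveServer2 or its configuration.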
I know I'm not the only guy trying to connect to Hive with pyhive on Windows - this guy (SASL error when trying to connect to hive(hue) by python from my PC - Windows10) was, but his 'solution' - install Ubuntu as a VM on his Windows box - isn't going to work for me.