
I tried to set up a Hive connection as described in "How to Access Hive via Python?", using hive.Connection with Python 3.5.2 (installed on a Cloudera Linux BDA), but the SASL package seems to cause a problem. I saw on a forum that SASL is only compatible with Python 2.7. Is that right? What did I miss or do wrong?

from pyhive import hive
conn = hive.Connection(host="myserver", port=10000)
import pandas as pd

Error message

TTransportException Traceback (most recent call last)
in ()
1 from pyhive import hive
2 #conn = hive.Connection(host="myserver", port=10000)
----> 3 conn = hive.Connection(host="myserver")
4 import pandas as pd

/opt/anaconda3/lib/python3.5/site-packages/pyhive/hive.py in __init__(self, host, port, username, database, auth, configuration)
102
103 try:
--> 104 self._transport.open()
105 open_session_req = ttypes.TOpenSessionReq(
106 client_protocol=protocol_version,

/opt/anaconda3/lib/python3.5/site-packages/thrift_sasl/__init__.py in open(self)
70 if not ret:
71 raise TTransportException(type=TTransportException.NOT_OPEN,
---> 72 message=("Could not start SASL: %s" % self.sasl.getError()))
73
74 # Send initial response

TTransportException: TTransportException(message="Could not start SASL: b'Error in sasl_client_start (-4) SASL(-4): no mechanism available: No worthy mechs found'", type=1)
CDspace
Thomas Bury

2 Answers


We (or rather, our IT team) found a solution.

We upgraded the Python packages thrift (to version 0.10.0) and PyHive (to version 0.3.0). I don't know why the versions we had been using weren't the latest.

Added the following:

<property>
<name>hive.server2.authentication</name>
<value>NOSASL</value>
</property>

This property was added to the following Hive config parameters in Cloudera Manager:

  • HiveServer2 Advanced Configuration Snippet (Safety Valve) for hive-site.xml
  • Hive Client Advanced Configuration Snippet (Safety Valve) for hive-site.xml (necessary so that HUE would work)
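
To confirm that the setting actually ended up in the file a client reads, the property can be looked up programmatically. This is only a minimal sketch using the standard library: the helper name get_hive_property and the inline sample are illustrative and not part of Hive or PyHive, and the real location of hive-site.xml depends on the installation.

```python
import xml.etree.ElementTree as ET


def get_hive_property(xml_text, name):
    """Return the value of a Hadoop-style <property> entry, or None if absent."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Inline sample standing in for the contents of a real hive-site.xml.
SAMPLE = """
<configuration>
  <property>
    <name>hive.server2.authentication</name>
    <value>NOSASL</value>
  </property>
</configuration>
"""

print(get_hive_property(SAMPLE, "hive.server2.authentication"))
```

For a real file, the text of hive-site.xml would be read from disk and passed in place of SAMPLE.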

from pyhive import hive
conn = hive.Connection(host="myserver", auth='NOSASL')
import pandas as pd
import sys

df = pd.read_sql("SELECT * FROM my_table", conn) 
print(sys.getsizeof(df))
df.head()

After these changes, the code above worked without any problems or errors.
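
The same steps can be wrapped in a small helper that rejects an unsupported auth value before any network call is made. This is a sketch, not PyHive's own API: SUPPORTED_AUTH, connection_kwargs and read_table are hypothetical names, the list of modes is taken from the error message in the PyHive source quoted in the comments below, and "myserver" and the query are placeholders. LDAP and KERBEROS typically need further arguments that this sketch does not handle.

```python
# Auth modes named in PyHive's NotImplementedError message (assumption: the
# installed PyHive version accepts exactly these).
SUPPORTED_AUTH = ("NONE", "NOSASL", "LDAP", "KERBEROS")


def connection_kwargs(host, auth="NOSASL", port=10000):
    """Validate the auth mode and build keyword arguments for hive.Connection."""
    if auth not in SUPPORTED_AUTH:
        raise ValueError("unsupported auth mode: {!r}".format(auth))
    return {"host": host, "port": port, "auth": auth}


def read_table(query, **kwargs):
    """Run a query through PyHive and return the result as a pandas DataFrame."""
    # Imported lazily so the validation above works without PyHive installed.
    from pyhive import hive
    import pandas as pd

    conn = hive.Connection(**connection_kwargs(**kwargs))
    try:
        return pd.read_sql(query, conn)
    finally:
        conn.close()


# Example call (needs a reachable HiveServer2 configured for NOSASL):
# df = read_table("SELECT * FROM my_table", host="myserver")
```

Failing early on the auth value gives a clearer message than a transport error raised halfway through opening the session.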

Best, Tom

Thomas Bury
  • My hive.server2.authentication is set to CUSTOM, so what do I need to do, and how? – Indrajeet Gour Jul 06 '17 at 10:28
  • Judging by the PyHive code, `raise NotImplementedError( "Only NONE, NOSASL, LDAP, KERBEROS " "authentication are supported, got {}".format(auth))`, CUSTOM does not appear to be supported, so I actually don't know. Sorry, that's not much help. – Thomas Bury Jul 07 '17 at 11:04

Check that you have all the dependencies installed:

gcc-c++
python-devel.x86_64
cyrus-sasl-devel.x86_64

(assuming you're on a Red Hat-based Linux distribution; these are RPM package names, not Windows packages)
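
Because compiled packages are installed per interpreter, it can help to check from inside the Python that actually runs PyHive whether the relevant modules can be found. A minimal sketch: missing_modules is an illustrative helper, and the module list is taken from the traceback in the question plus sasl, which thrift_sasl is assumed to rely on here.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Shows which interpreter is running, e.g. the Anaconda one rather than the system one.
print("Interpreter:", sys.executable)
print("Missing:", missing_modules(["pyhive", "thrift", "thrift_sasl", "sasl"]))
```

An empty list only shows that the modules are present for this interpreter; it does not prove that the SASL mechanism plugins are available on the system.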

Drahoš Maďar
  • Further info: gcc and cyrus-sasl-devel are installed, and python-devel as well (but the 2.6 version, as that is the version shipped with the operating system, the Cloudera suite for the BDA). We're not using the system's Python though, but the one supplied by Anaconda. – Thomas Bury Jun 14 '17 at 06:57
  • python 3.5.2 Anaconda custom (64-bit) [GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] – Thomas Bury Jun 14 '17 at 13:43