I need to connect to HiveServer2 from Python 3.4.5, with the code running on the Hadoop cluster itself. Usually I run hive -e "some query" from the command line. From other servers, we connect to HiveServer2 by providing the IP and port and authenticating with a username only. That should not be necessary here, since I am running the code on the same server.

I have tried the following:

  1. Access Hive Data Using Python
  2. https://github.com/cloudera/impyla/issues/165
  3. How to connect to Hadoop Hive through python via pyhs2?
  4. https://pypi.python.org/pypi/impyla

but without success. The error occurs at the connection stage itself. I can share the error messages if anyone needs them.

If nothing else, it would be great if someone could elaborate on the answer to "Hive client for Python 3.x".

Drunk Knight

1 Answer

With help from a friend and a little tweaking of answers available online, using impala.dbapi solved the issue:

from impala.dbapi import connect

# HiveServer2 runs on the same host; 10000 is its default Thrift port
conn = connect(host='localhost', port=10000, auth_mechanism='PLAIN')
cursor = conn.cursor()
cursor.execute('show databases')
results = cursor.fetchall()  # all rows of the result set
print(type(results))
print(results)
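
The snippet above never closes the cursor or the connection. A minimal sketch of one way to handle that, assuming the same host, port and auth_mechanism as above: run_query and show_hive_databases are hypothetical helper names, not part of impyla, and the helper relies only on the connection and cursor having a close() method, as DB-API 2.0 objects do.

```python
from contextlib import closing


def run_query(connect, query, **conn_kwargs):
    """Run `query` through a DB-API 2.0 `connect` callable and return all rows.

    The cursor and the connection are closed even if the query raises.
    """
    with closing(connect(**conn_kwargs)) as conn:
        with closing(conn.cursor()) as cursor:
            cursor.execute(query)
            return cursor.fetchall()


def show_hive_databases(host="localhost", port=10000):
    """Hypothetical wrapper reusing the connection settings from the answer."""
    # Imported lazily so run_query can be used (and tested) without impyla.
    from impala.dbapi import connect
    return run_query(connect, "show databases",
                     host=host, port=port, auth_mechanism="PLAIN")
```

Calling show_hive_databases() on the cluster node should return the same rows as the original snippet. Passing connect in as an argument also makes it easy to swap in another DB-API driver or a stub for testing.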
Drunk Knight