I need to connect Spark to Power BI, but I don't know which drivers are required. I am running Spark in local mode without Apache Hive installed, so I have no hive-site.xml file for configuring the Thrift server. After starting the Thrift server, I ran $SPARK_HOME\bin\beeline.cmd and connected to it with the command !connect jdbc:hive2://localhost:10000
I used Administrator as the user ID (the same as my local machine account) and a blank password. The output was:
beeline> !connect jdbc:hive2://localhost:10000
Connecting to jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000: Administrator
Enter password for jdbc:hive2://localhost:10000:
log4j:WARN No appenders could be found for logger (org.apache.hive.jdbc.Utils).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Connected to: Spark SQL (version 2.0.1)
Driver: Hive JDBC (version 1.2.1.spark2)
Transaction isolation: TRANSACTION_REPEATABLE_READ
The connection appears to succeed, but when I list the databases with the command show databases; I get the following error in beeline:
Error: org.apache.thrift.transport.TTransportException: java.net.SocketException: Software caused connection abort: socket write error (state=08S01,code=0)
At the same time, the Thrift server console shows:
Exception in thread "HiveServer2-Handler-Pool: Thread-XXX"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "HiveServer2-Handler-Pool: Thread-XXX"
I don't understand this error. Can anyone help me with it? I also want to connect Spark to Power BI Desktop, which is installed on the same local machine. Can someone point me to documentation on how to set up that connection?
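
In case it helps narrow things down, here is a minimal sketch of a check that the Thrift server port accepts plain TCP connections independently of beeline. It assumes the server listens on localhost:10000, as in the JDBC URL above; the helper name is_port_open is hypothetical and not part of Spark or Hive. It only tests reachability and says nothing about the OutOfMemoryError itself.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (including timeouts) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Assumed host and port, taken from jdbc:hive2://localhost:10000
    print(is_port_open("localhost", 10000))
```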