
I get a NullPointerException after connecting BI tools like Redash or Superset to a Spark Thriftserver (STS); both tools use PyHive. Apache Zeppelin works fine for queries against the STS, and I could never reproduce the error there (Zeppelin uses org.apache.hive.jdbc.HiveDriver).

DB engine Error
hive error: ('Query error', 'Error running query: java.lang.NullPointerException')

This sends the STS into a state where only a restart can bring it back. After that, queries from all clients fail (Zeppelin, beeline, Redash, Superset). It seems to occur mostly when the schema is fetched automatically (which doesn't quite work: the DB name is fetched correctly, but the table names are wrong). While browsing the PyHive code I encountered some incompatibilities between PyHive <-> STS (like this and this). The connection between Redash/Superset and STS itself works, and I am able to run queries until the Thriftserver enters the broken state.
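Since every client fails once the STS is in the broken state, a trivial probe query should be enough to detect it, for example from a Kubernetes liveness check that restarts the pod. The following is only a rough sketch of that idea, not a fix: the function names sts_is_healthy and pyhive_connect are hypothetical, the probe statement SELECT 1 is an assumption, and my-host / 10000 are taken from the start command in the setup.

```python
from typing import Any, Callable


def sts_is_healthy(connect: Callable[[], Any], probe_sql: str = "SELECT 1") -> bool:
    """Return True if a trivial query succeeds, False on any error.

    `connect` is any zero-argument callable returning a DB-API style
    connection (hypothetical helper, not part of PyHive or Spark).
    """
    conn = None
    try:
        conn = connect()
        cursor = conn.cursor()
        cursor.execute(probe_sql)
        cursor.fetchall()
        return True
    except Exception:
        # In the broken state the server reports
        # "Error running query: java.lang.NullPointerException".
        return False
    finally:
        if conn is not None:
            try:
                conn.close()
            except Exception:
                pass


def pyhive_connect(host: str = "my-host", port: int = 10000) -> Any:
    """Open a PyHive connection; host and port are assumptions from the setup."""
    from pyhive import hive  # imported lazily so the module loads without PyHive

    return hive.Connection(host=host, port=port)
```

A probe script could call sts_is_healthy(pyhive_connect) and exit non-zero when it returns False, so that the orchestrator performs the restart.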

I understand why schema refresh doesn't work (and might be able to work around it), but I don't understand why the Thriftserver enters an unrecoverable, broken state with the NullPointerException.

My setup:

  • Kubernetes
  • Delta Lake with data formatted as delta
  • Hive Metastore
  • Spark Cluster where a Spark Thriftserver is started: start-thriftserver.sh --total-executor-cores 3 --driver-memory 3G --executor-memory 1536M --hiveconf hive.server2.thrift.port 10000 --hiveconf hive.server2.thrift.max.worker.threads 2000 --hiveconf hive.server2.thrift.bind.host my-host (I also tried spark.sql.thriftServer.incrementalCollect=false but that didn't affect anything.)
  • Redash / Apache Superset connected to the STS
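
To narrow down which statement triggers the NullPointerException, the schema fetch can be replayed outside Redash/Superset, one statement at a time, over a plain PyHive connection. This is a minimal sketch under assumptions: first_failing_statement is a hypothetical helper, and the statement list is only a guess at typical schema-browsing queries, not what either tool is confirmed to send.

```python
from typing import Any, Iterable, Optional, Tuple

# Assumed schema-browsing statements; replace them with whatever the
# BI tool actually issues (for example, taken from the STS logs).
ASSUMED_SCHEMA_STATEMENTS = [
    "SHOW SCHEMAS",
    "SHOW TABLES IN default",
]


def first_failing_statement(
    cursor: Any, statements: Iterable[str]
) -> Optional[Tuple[str, str]]:
    """Run statements in order and return (sql, error) for the first failure.

    Returns None if every statement executes and its rows can be fetched.
    """
    for sql in statements:
        try:
            cursor.execute(sql)
            cursor.fetchall()
        except Exception as exc:
            return sql, repr(exc)
    return None


if __name__ == "__main__":
    from pyhive import hive  # requires PyHive and a reachable STS

    # "my-host" and 10000 come from the start-thriftserver.sh command above.
    conn = hive.Connection(host="my-host", port=10000)
    try:
        print(first_failing_statement(conn.cursor(), ASSUMED_SCHEMA_STATEMENTS))
    finally:
        conn.close()
```

Running the same statements through beeline or Zeppelin afterwards would show whether the statement itself or the PyHive client path is what breaks the server.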
Jacek Laskowski
Daniel Müller
    I have the same problem. Interestingly, connections from DBeaver work (driver org.spark-project.hive:hive-jdbc:RELEASE). But when I try to connect via ODBC, STS throws an NPE and enters an unrecoverable state, so I have to restart it. – Tomas Bartalos Jun 10 '21 at 13:44
  • @TomasBartalos I kept trying for a while longer and gave up on STS in the end. It appears that STS isn't as compatible as advertised, so I now use Presto. – Daniel Müller Jun 29 '21 at 11:35

0 Answers