Questions tagged [spark-thriftserver]

95 questions
16
votes
3 answers

How to connect to remote hive server from spark

I'm running Spark locally and want to access Hive tables, which are located in the remote Hadoop cluster. I'm able to access the Hive tables by launching beeline under SPARK_HOME [ml@master spark-2.0.0]$./bin/beeline Beeline version 1.2.1.spark2…
April
9
votes
1 answer

Why spark executor cores are not equal with active tasks in spark web UI?

I'm using the Spark 2.3 Thrift server for ad-hoc SQL queries. My Spark parameters are set as below in the spark-defaults.conf file: spark.executor.memory 24G spark.executor.cores 40 spark.executor.instances 3 However, when I checked the Spark web UI, the…
louis lau
8
votes
1 answer

Spark build in hive MySQL metastore isn't being used

I'm using Apache Spark 2.1.1 and I have put the following hive-site.xml in the $SPARK_HOME/conf folder: javax.jdo.option.ConnectionURL
6
votes
0 answers

spark thrift server does not clean shuffle files

We are running SQL queries against a Spark EMR cluster using Spark Thrift Server, and we see that when a SQL query (translated to a Spark job) is finished, its shuffle files located under /mnt/yarn/usercache/root/appcache are not cleaned. This causes No…
5
votes
2 answers

How to access custom UDFs through Spark Thrift Server?

I am running Spark Thrift Server on EMR. I start up the Spark Thrift Server by: sudo -u spark /usr/lib/spark/sbin/start-thriftserver.sh --queue interactive.thrift --jars /opt/lib/custom-udfs.jar Notice that I have a custom UDF jar and I want…
4
votes
2 answers

How to run Spark SQL Thrift Server in local mode and connect to Delta using JDBC

I'd like to connect to Delta using JDBC and would like to run the Spark Thrift Server (STS) in local mode to kick the tyres. I start STS using the following command: $SPARK_HOME/sbin/start-thriftserver.sh \ --conf…
4
votes
1 answer

Why is my Spark Thrift server very slow with HTTP?

My organisation set up a Spark Thrift server that is configured to use SSL over HTTP. The intent is to enable Power BI to retrieve data via Spark securely. However, simply retrieving schema information can take up to 10 minutes, and a further 10+…
QA Collective
4
votes
1 answer

Power BI & Spark - ODBC: ERROR [HY000] [Microsoft][ThriftExtension] (4)

I am connecting Power BI to Spark but getting this error after attempting connection: Details: "ODBC: ERROR [HY000] [Microsoft][ThriftExtension] (4) Error occurred while contacting server: SSL_read: error code: 0. The connection has been…
4
votes
1 answer

How to register custom UDF jar in HiveThriftServer2?

In the HiveThriftServer2 class, what is the difference between calling startWithContext and calling main? I have a custom UDF jar that I want to register, so that every time the Thrift server boots up, all of these are configured automatically. Is…
seamonkeys
4
votes
2 answers

Starting thrift server in spark

Can anyone help me with starting the Spark Thrift server? I am running my script in standalone mode and I want to fetch data in my business intelligence tool. In order to do that I need to start the Thrift server. I tried running the shell script:…
Bhanuday Birla
3
votes
0 answers

How to set row batch size for incrementalCollect in Apache Spark Thrift server?

I enabled spark.sql.thriftServer.incrementalCollect in my Thrift server (Spark 3.1.2) to prevent OutOfMemory exceptions. This worked fine, but my queries are really slow now. I checked the logs and found that Thrift is querying batches of 10.000…
3
votes
1 answer

Get number of rows read in Spark thrift server query through Listener

I'm trying to build a monitoring system for our Spark Thrift server. For now, logging the query, the rows retrieved/read, and the time spent would be fine. I have implemented a custom Listener, and I am able to retrieve the query and time without problem, listening…
SCouto
3
votes
1 answer

Unused spark worker

I have configured a standalone Spark cluster connected to a Cassandra cluster, with 1 master, 1 slave, and a Thrift server that is used as the JDBC connector for a Tableau application. The slave appears in the workers list, but when I launch any query the worker does not…
stebetko
3
votes
0 answers

Spark failure detection - why datanode not send heartbeat to the master machine ( driver )

As we all know, a heartbeat is a signal sent periodically to indicate normal operation of a node or to synchronize with other parts of the system. In our system we have 5 worker machines, while executors run on 3 of them. Our system includes 5…
Judy
3
votes
3 answers

Intercept and modify incoming SQL queries to Spark Thrift Server

I have a thrift server up and running, with users sending queries over a JDBC connection. Can I intercept and modify the queries as they come in, and then send the result of the modified query back to the user? For example - I want the user to be…
bendl