Questions tagged [spark-hive]

Use this tag for questions about the spark-hive module or HiveContext.

Apache Spark Hive is a module for "Hive and structured data processing" on Spark, a fast and general-purpose cluster computing system. It is a superset of Spark SQL and is used to create a HiveContext, similar to a SQLContext.

76 questions
35 votes · 2 answers

Querying on multiple Hive stores using Apache Spark

I have a Spark application that successfully connects to Hive and queries Hive tables using the Spark engine. To build this, I just added hive-site.xml to the application's classpath, and Spark reads the hive-site.xml to connect to its…
karthik manchala • 13,492 • 1 • 31 • 55
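
Placing hive-site.xml on the application classpath is the standard way to point Spark at an existing Hive metastore. A minimal sketch of querying Hive from Spark, assuming a Spark 2.x SparkSession and a hypothetical table my_db.my_table:

    import org.apache.spark.sql.SparkSession

    // hive-site.xml on the classpath (or in $SPARK_HOME/conf) tells Spark
    // which metastore to connect to; enableHiveSupport() activates the Hive catalog.
    val spark = SparkSession.builder()
      .appName("hive-query-example")
      .enableHiveSupport()
      .getOrCreate()

    // Query a Hive table through the Hive-backed catalog.
    spark.sql("SELECT * FROM my_db.my_table LIMIT 10").show()

Querying two different metastores from one application is harder, since a session binds to a single metastore; the question above is about exactly that gap.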
11 votes · 3 answers

Apache spark Hive, executable JAR with maven shade

I'm building an Apache Spark application with Apache Spark Hive. So far everything has been fine - I've been running tests and the whole application in IntelliJ IDEA, and all tests together using Maven. Now I want to run the whole application from bash and let it run…
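
A frequent failure mode when shading Spark applications is that the META-INF/services registry files (e.g. org.apache.spark.sql.sources.DataSourceRegister) from different jars overwrite each other in the fat jar. A hedged sketch of the usual fix, merging service files with the Maven shade plugin (the plugin version is illustrative):

    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.4</version>
      <configuration>
        <transformers>
          <!-- Merge META-INF/services entries instead of overwriting them,
               so Spark can still discover its data sources at runtime. -->
          <transformer implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        </transformers>
      </configuration>
    </plugin>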
10 votes · 4 answers

Missing hive-site when using spark-submit YARN cluster mode

Using HDP 2.5.3, I've been trying to debug some YARN container classpath issues. Since HDP includes both Spark 1.6 and 2.0.0, there have been some conflicting versions. The users I support are successfully able to use Spark2 with Hive queries in YARN…
OneCricketeer • 179,855 • 19 • 132 • 245
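
In YARN cluster mode the driver runs inside the cluster, so a hive-site.xml that only lives on the submitting machine's classpath never reaches it. One commonly suggested workaround is shipping the file with the job (the path is illustrative):

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --files /etc/spark2/conf/hive-site.xml \
      my-app.jar

Spark copies files passed via --files into each container's working directory, which YARN places on the container classpath.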
10 votes · 4 answers

How to set hive.metastore.warehouse.dir in HiveContext?

I'm trying to write a unit test case that relies on DataFrame.saveAsTable() (since it is backed by a file system). I point the Hive warehouse parameter to a local disk location: sql.sql(s"SET…
tribbloid • 4,026 • 14 • 64 • 103
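
For a test like this, the warehouse location generally has to be set before the first table is written. A minimal sketch against the HiveContext-era API (the local path is illustrative):

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)
    // Must be set before any table is created, or the default location wins.
    hiveContext.setConf("hive.metastore.warehouse.dir", "/tmp/test-warehouse")

In Spark 2.x the equivalent knob is spark.sql.warehouse.dir, set on the SparkSession builder before getOrCreate().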
5 votes · 2 answers

Spark hive udf: no handler for UDAF analysis exception

I created a project 'spark-udf' and wrote a Hive UDF as below: package com.spark.udf import org.apache.hadoop.hive.ql.exec.UDF class UpperCase extends UDF with Serializable { def evaluate(input: String): String = { input.toUpperCase } Built…
Swapnil Chougule • 717 • 9 • 17
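
Classes extending the old org.apache.hadoop.hive.ql.exec.UDF base class go through Hive's function resolver, which is where the "no handler" analysis error surfaces. A hedged alternative that sidesteps the Hive machinery is registering a native Spark UDF (assumes an active SparkSession named spark; the function and table names are illustrative):

    // Register a plain Scala function as a Spark SQL UDF instead of a Hive UDF.
    spark.udf.register("upper_case", (input: String) => input.toUpperCase)

    spark.sql("SELECT upper_case(name) FROM people").show()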
5 votes · 2 answers

How can I update/delete data in Spark-hive?

I don't think my title can explain the problem, so here it is. Details - build.sbt: name := "Hello" scalaVersion := "2.11.8" version := "1.0" libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.0" libraryDependencies +=…
yashpal bharadwaj • 323 • 2 • 6 • 14
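
Spark SQL in the 2.x line does not execute Hive ACID UPDATE/DELETE statements, so the usual workaround is read-transform-rewrite. A hedged sketch (table and column names are illustrative):

    import org.apache.spark.sql.functions.col

    // Emulate DELETE: keep only the rows that should survive, then write the
    // result out as a new table (overwriting the table being read fails).
    val kept = spark.table("my_db.users").filter(col("active") === true)
    kept.write.mode("overwrite").saveAsTable("my_db.users_cleaned")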
5 votes · 2 answers

Select all except particular column in spark sql

I want to select all columns in a table except StudentAddress, and hence I wrote the following query: select `(StudentAddress)?+.+` from student; It gives the following error in the SQuirreL SQL client: org.apache.spark.sql.AnalysisException: cannot resolve…
Patel • 129 • 1 • 1 • 11
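
The backtick-regex column syntax is only honored when spark.sql.parser.quotedRegexColumnNames is enabled (Spark 2.3+). A simpler DataFrame-side alternative, sketched under the assumption that dropping the one column is the real goal:

    // Select everything except StudentAddress, without regex column syntax.
    spark.table("student").drop("StudentAddress").show()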
4 votes · 1 answer

sparkpy insists that root scratch dir: /tmp/hive on HDFS should be writable

I am trying to run a PySpark program that accesses the Hive server. The program terminates by throwing the error: pyspark.sql.utils.AnalysisException: 'java.lang.RuntimeException: java.lang.RuntimeException: The root scratch dir: /tmp/hive on HDFS…
Realdeo • 449 • 6 • 19
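
This error generally means the Hive scratch directory exists but is not world-writable. A commonly cited fix, hedged since the right permissions depend on your environment:

    hdfs dfs -mkdir -p /tmp/hive
    hdfs dfs -chmod -R 777 /tmp/hive

(On Windows the same chmod is done against the local filesystem with winutils.exe.)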
4 votes · 1 answer

Relative path in absolute URI Exception while accessing DynamoDB via Glue Data Catalogue in PySpark running on EMR

I am executing a PySpark application on AWS EMR that is configured to use the AWS Glue Data Catalog as its metastore. I have a table set up in AWS Glue that points to a DynamoDB table. Now, in my PySpark script, I am trying to access the Glue table. I am…
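
For context, reading a Glue-cataloged table normally goes through the Hive-enabled session; a minimal, hedged sketch (the database and table names are hypothetical, and the URI exception itself usually points at a malformed LOCATION on the catalog entry rather than at this code):

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("glue-catalog-read")
      .enableHiveSupport() // on EMR, resolves tables through the Glue Data Catalog
      .getOrCreate()

    spark.sql("SELECT * FROM glue_db.dynamo_backed_table").show()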
3 votes · 0 answers

How to read sql files in pyspark?

I've been trying to run this code, expecting it to create a table from a SQL file that contains the table's schema and values, using PySpark. I can't seem to understand the error. Please help me. --------------------SQL…
Vishal Ch • 59 • 2 • 4
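
Spark has no built-in entry point for executing a .sql file, so the common pattern is to read the file and feed each statement to spark.sql. A hedged sketch (the path is illustrative, and the naive split assumes no semicolons inside string literals):

    import scala.io.Source

    val script = Source.fromFile("/path/to/schema.sql").mkString
    // Split on ';' and run each non-empty statement through the SQL engine.
    script.split(";").map(_.trim).filter(_.nonEmpty).foreach(stmt => spark.sql(stmt))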
3 votes · 2 answers

Spark sql saveAsTable create table append mode if new column is added in avro schema

I am using a Spark SQL Dataset to write data into Hive. It works perfectly if the schema is the same, but if I change the Avro schema by adding a new column in between, it shows the error (the schema is provided from a schema registry): Error running job streaming…
Sumit G • 436 • 8 • 21
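
An append with saveAsTable fails when the incoming DataFrame carries columns the Hive table does not yet have. One hedged approach on Spark 2.2+ is to evolve the table first, then append (the table and column names are illustrative):

    // Add the new column to the existing Hive table before appending.
    spark.sql("ALTER TABLE my_db.events ADD COLUMNS (new_field STRING)")
    newData.write.mode("append").saveAsTable("my_db.events")

Note that Hive appends new columns at the end, so a column added in the middle of the Avro schema still has to be written by name rather than by position.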
3 votes · 0 answers

Spark build failed

I downloaded the Spark source from the Apache site, then built it using Maven. Spark version 1.6.3, Hadoop version 2.7.3, Scala version 2.10.4. I used the command below to build the project: ./build/mvn -Pyarn -Phadoop-2.7…
lucy • 4,136 • 5 • 30 • 47
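
For reference, the documented Maven invocation for that Spark/Hadoop combination looks roughly like this (the -Dhadoop.version and -DskipTests flags come from the standard Spark build instructions, not from the question):

    ./build/mvn -Pyarn -Phadoop-2.7 -Dhadoop.version=2.7.3 -DskipTests clean package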
3 votes · 1 answer

HiveContext createDataFrame not working on pySpark (jupyter)

I am doing an analysis in PySpark using Jupyter notebooks. My code originally built DataFrames using sqlContext = SQLContext(sc), but now I've switched to HiveContext since I will be using window functions. My problem is that now I'm getting a…
masta-g3 • 1,202 • 4 • 17 • 27
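
A frequent cause is having both a plain SQLContext and a HiveContext alive in one notebook: the embedded Derby metastore allows only a single active connection. A hedged sketch of using one Hive-aware context for everything (shown in Scala; the data is illustrative):

    import org.apache.spark.sql.hive.HiveContext

    // Create exactly one Hive-aware context and reuse it everywhere,
    // including createDataFrame and window-function queries.
    val hiveContext = new HiveContext(sc)
    val df = hiveContext.createDataFrame(Seq((1, "a"), (2, "b"))).toDF("id", "value")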
3 votes · 1 answer

Spark CSV IOException Mkdirs failed to create file

TL;DR: Spark 1.6.1 fails to write a CSV file using Spark CSV 1.4 on a standalone cluster with no HDFS, with an IOException: Mkdirs failed to create file. More details: I'm working on a Spark 1.6.1 application, running it on a standalone cluster using a…
Gideon • 2,211 • 5 • 29 • 47
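
On a standalone cluster without HDFS, each executor writes its partition to its own local filesystem, so the target directory must exist and be writable on every worker (e.g. a shared NFS mount). A hedged sketch using the spark-csv package from the question (the path is illustrative):

    // The file:// path must be visible to every worker node, e.g. an NFS
    // mount; otherwise executors fail with "Mkdirs failed to create".
    df.write
      .format("com.databricks.spark.csv") // spark-csv 1.x, as in the question
      .option("header", "true")
      .save("file:///shared/output/result.csv")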
3 votes · 0 answers

Field delimiter of Hive table not recognized by spark HiveContext

I have created a Hive external table stored as textfile, partitioned by event_date Date. How do we specify a specific CSV format when reading the Hive table in Spark? The environment is Spark 1.5.0 - cdh5.5.1, using Scala version…
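
HiveContext honors only the delimiter declared in the table's DDL, so the usual fix is to make it explicit when creating the external table. A hedged sketch (the table name, columns, delimiter, and location are illustrative):

    hiveContext.sql("""
      CREATE EXTERNAL TABLE IF NOT EXISTS events (id INT, name STRING)
      PARTITIONED BY (event_date DATE)
      ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
      STORED AS TEXTFILE
      LOCATION '/data/events'
    """)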