Questions tagged [apache-spark-1.5.2]

Use for questions specific to Apache Spark 1.5.2. For general questions related to Apache Spark, use the tag [apache-spark].

7 questions
9
votes
0 answers

Shuffle files missing

I'm getting random instances of shuffle files not being written while using Spark. 15/12/29 17:30:26 ERROR server.TransportRequestHandler: Error sending result ChunkFetchSuccess{streamChunkId=StreamChunkId{streamId=347837678000,…
Kirk Broadhurst
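
A common first mitigation for transient shuffle-fetch failures like this (a sketch, not a confirmed fix for this report) is to let fetches retry harder; both properties below exist in Spark 1.5:

    import org.apache.spark.{SparkConf, SparkContext}

    // Retry shuffle block fetches instead of failing fast; the 1.5.x
    // defaults are maxRetries = 3 and retryWait = 5s.
    val conf = new SparkConf()
      .setAppName("shuffle-retry-demo")
      .set("spark.shuffle.io.maxRetries", "10")
      .set("spark.shuffle.io.retryWait", "15s")
    val sc = new SparkContext(conf)
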
7
votes
1 answer

What is and how to control Storage Memory in the Executors tab of the web UI?

I use Spark 1.5.2 for a Spark Streaming application. What is the Storage Memory shown in the Executors tab of the web UI? How did it reach 530 MB? How can I change that value?
AkhilaV
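
For context: in Spark 1.5.x (the legacy memory model, before unified memory arrived in 1.6), storage memory is roughly executor heap × spark.storage.memoryFraction (default 0.6) × spark.storage.safetyFraction (default 0.9), which is how a 1 GB executor lands near 530 MB. A minimal sketch of changing it:

    import org.apache.spark.{SparkConf, SparkContext}

    // storage memory ≈ heap * memoryFraction * safetyFraction (pre-1.6 model)
    val conf = new SparkConf()
      .setAppName("storage-memory-demo")
      .set("spark.executor.memory", "2g")          // grow the heap, and/or
      .set("spark.storage.memoryFraction", "0.6")  // tune the fraction
    val sc = new SparkContext(conf)
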
3
votes
3 answers

Where is the Hive metastore warehouse directory path to store database/tables?

I have installed the Spark 1.5.2 build with Hive on a Linux machine. The default path for the Hive metastore warehouse directory is /user/hive/warehouse. Is this a local path or an HDFS path? I ask because I couldn't find this path…
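
For reference: hive.metastore.warehouse.dir is resolved against the default filesystem, so on an HDFS-backed cluster /user/hive/warehouse is an HDFS path, not a local one. A hedged sketch of overriding it from Spark 1.5 (the path is illustrative; sc is the shell's SparkContext):

    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)
    // Resolved against fs.defaultFS: hdfs://<namenode>/user/hive/warehouse on
    // a cluster, file:/user/hive/warehouse when no HDFS is configured.
    hiveContext.setConf("hive.metastore.warehouse.dir", "/user/hive/warehouse")
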
3
votes
0 answers

Spark job showing 'unknown' in active stages and stuck

I am running a Spark job to calculate interactions. After the map phase I group by the key I want, and Spark stays stuck in a pending state, showing no error and unknown stage information. I want to know what may cause this and how to check it…
giaosudau
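
Without more detail this is only a guess, but a shuffle over skewed or very large key groups is a common reason a stage sits pending with no error; a map-side-combining alternative is often the first thing to try. A sketch with hypothetical data:

    // Hypothetical pair RDD standing in for the question's mapped data.
    val pairs = sc.parallelize(Seq(("a", 1), ("b", 2), ("a", 3)))

    // groupByKey ships every value through the shuffle; reduceByKey
    // combines on the map side first and shuffles far less data.
    val grouped = pairs.groupByKey()         // can stall on skewed keys
    val summed  = pairs.reduceByKey(_ + _)   // usually far cheaper
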
1
vote
1 answer

Apache spark-shell: error importing jars

I have a local Spark 1.5.2 (Hadoop 2.4) installation on Windows, set up as explained here. I'm trying to import a jar file that I created in Java using Maven (the jar is jmatrw, which I uploaded to GitHub). Note the jar does not include a spark…
Donato Pirozzi
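
The standard way to put a local jar on the spark-shell classpath in 1.5 is the --jars flag; the path below is illustrative:

    spark-shell --master local[*] --jars C:\libs\jmatrw.jar
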
0
votes
1 answer

PySpark performance issue, Spark 1.5.2, Cloudera distribution

I experience performance issues when executing the following PySpark script: import os from pyspark.conf import SparkConf from pyspark.context import SparkContext from pyspark.sql.context import SQLContext, HiveContext from pyspark.sql.types import…
0
votes
0 answers

Spark streaming job does not return to driver

I have a Spark streaming program with the following structure, deployed in yarn-client mode with 4 executors. ListStream.foreachRDD(listJavaRDD -> { listJavaRDD.foreachPartition(tuple2Iterator -> { while (tuple2Iterator.hasNext()) { …
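
The excerpt is truncated, but the structure it uses is the usual one; a minimal Scala sketch of the same shape, with illustrative names:

    // foreachRDD runs on the driver; the foreachPartition closure is shipped
    // to the executors, so any task that blocks inside the while loop keeps
    // the batch (and hence the driver) from moving on.
    listStream.foreachRDD { rdd =>
      rdd.foreachPartition { iter =>
        while (iter.hasNext) {
          val record = iter.next()
          // process record; a blocking call here stalls the whole batch
        }
      }
    }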