Questions tagged [apache-spark-2.1.1]
10 questions
3
votes
3 answers
Unable to load pyspark inside virtualenv
I installed pyspark in a Python virtualenv. I also installed the newly released JupyterLab (http://jupyterlab.readthedocs.io/en/stable/getting_started/installation.html) in the same virtualenv. I was unable to fire pyspark within a…
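A common way to make pyspark use the virtualenv's interpreter (rather than the system Python) is to point the `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON` environment variables at it before any session starts. A minimal sketch — the commented-out SparkSession lines assume `pip install pyspark` was run inside the virtualenv:

```python
import os
import sys

# sys.executable is the Python binary of the currently active virtualenv.
# Point both the driver and the workers at it.
os.environ["PYSPARK_PYTHON"] = sys.executable
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable

# With the variables set, a session started from this process should use the
# virtualenv's Python (assumes pyspark is installed in the env):
# from pyspark.sql import SparkSession
# spark = SparkSession.builder.master("local[*]").getOrCreate()

print(os.environ["PYSPARK_PYTHON"])
```

The same variables can also be exported in the shell before launching `pyspark` or `jupyter lab`, which is often easier to keep consistent across notebooks.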

Pranay Aryal
- 5,208
- 4
- 30
- 41
2
votes
0 answers
Spark2 Datetime lookup efficient datastructure
I have a Spark application with records which contain the following information:
Hash - Some unique identifier for an item
Location - The location of the item
From - The date on which the item was first seen in location
To - Null if still there or…
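One structure that fits this kind of (hash, from, to, location) lookup — shown here as plain Python to illustrate the idea outside Spark; all names are illustrative — is a dict keyed by hash whose value is a list of intervals sorted by the `from` date, searched with `bisect`:

```python
import bisect
from datetime import date

# intervals[h] is a list of (from_date, to_date_or_None, location),
# sorted by from_date; to=None means "still there".
intervals = {
    "item-1": [
        (date(2024, 1, 1), date(2024, 3, 1), "warehouse-A"),
        (date(2024, 3, 1), None, "warehouse-B"),
    ],
}

def location_at(h, when):
    """Return the location of item `h` on date `when`, or None."""
    rows = intervals.get(h, [])
    # Find the last interval starting on or before `when`.
    i = bisect.bisect_right([r[0] for r in rows], when) - 1
    if i < 0:
        return None
    start, end, loc = rows[i]
    if end is None or when < end:
        return loc
    return None

print(location_at("item-1", date(2024, 2, 15)))  # warehouse-A
print(location_at("item-1", date(2024, 6, 1)))   # warehouse-B
```

In Spark the same idea usually shows up as a small broadcast variable holding this dict, so each executor can resolve lookups locally without a join.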

Jon Taylor
- 7,865
- 5
- 30
- 55
2
votes
0 answers
Assigning spark-deep-learning external jar to spark with python on amazon-EMR
I've been trying to get the spark-deep-learning library working on my EMR cluster so that I can read images in parallel with Python 2.7. I have been searching for quite some time now and have failed to reach a solution. I have tried…

nmh26
- 31
- 2
1
vote
0 answers
Uneven distribution of tasks among the spark executors
I am using spark-streaming 2.2.1 in production. In this application I read data from RabbitMQ, do further processing, and finally save it in Cassandra. I am facing a strange issue where the tasks are not evenly…

Naresh
- 5,073
- 12
- 67
- 124
1
vote
1 answer
Is it possible to expose/add your custom APIs to the existing Spark's driver REST endpoints?
Spark exposes certain API endpoints (usually mounted at /api/v1). Is there some way to expose custom endpoints using the same Spark server?
(Using Spark 2.1.1, Structured Streaming)

RichieNotSoRich
- 31
- 4
1
vote
0 answers
Writing the output of Batch Queries to Kafka for Spark version 2.1.1
Can somebody give me pointers on how I can load the output of batch queries into Kafka?
I researched a lot on Stack Overflow and in other articles, but I was unable to find anything for Spark 2.1.1.
For higher versions of spark, there is an easy way to…
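Before the built-in Kafka batch sink existed, a common workaround was to iterate the DataFrame's partitions and publish each row with a plain Kafka client (kafka-python's `KafkaProducer` is a typical choice). The sketch below shows the shape of that pattern in pure Python, with a stub producer standing in for the real client so it runs without a broker; all names are illustrative:

```python
class StubProducer:
    """Stands in for kafka.KafkaProducer so the pattern runs without a broker."""
    def __init__(self):
        self.sent = []

    def send(self, topic, value):
        self.sent.append((topic, value))

    def flush(self):
        pass

def publish_partition(rows, topic, make_producer):
    # In Spark this function would be passed to df.rdd.foreachPartition, so
    # that one producer is created per partition rather than per row.
    producer = make_producer()
    for row in rows:
        producer.send(topic, value=str(row).encode("utf-8"))
    producer.flush()
    return producer  # returned here only so the stub can be inspected

# Simulated partition of batch-query results:
partition = [("a", 1), ("b", 2)]
p = publish_partition(partition, "events", StubProducer)
print(len(p.sent))  # 2
```

Creating the producer inside the partition function matters: producers are not serializable, so they cannot be created on the driver and shipped to executors.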

user2175104
- 67
- 8
1
vote
0 answers
Issue with try and except block in pyspark
I use Spark 2.1. Below is my code:
delta = "insert overwrite table schema1.table1 select * from schema2.table2"
try:
    spark.sql(delta)
except Exception as e:
    spark.sql("drop table schema2.table2")
…
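As an aside on the pattern itself: a cleanup statement inside `except` will mask the original error if it is followed by a bare failure, so chaining the original exception before re-raising keeps the first failure visible. A plain-Python sketch of that shape, where `run_sql` is a hypothetical stand-in for `spark.sql` so the control flow is runnable:

```python
def run_sql(statement, fail=False):
    # Hypothetical stand-in for spark.sql() — just succeeds or raises.
    if fail:
        raise RuntimeError("query failed: " + statement)
    return "ok"

def insert_with_cleanup(fail_insert=False):
    try:
        return run_sql("insert overwrite table t1 select * from t2",
                       fail=fail_insert)
    except Exception as e:
        # Clean up, then re-raise with the original exception chained,
        # so the root cause is not swallowed.
        run_sql("drop table t2")
        raise RuntimeError("insert failed, t2 dropped") from e

print(insert_with_cleanup())  # ok
```

With `raise ... from e`, the traceback shows both the cleanup-level error and the original query failure.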

aswinkarthikeyan
- 11
- 1
- 2
0
votes
1 answer
Saved Model : LinearRegression does not seem to work
I am using Azure and my Spark version is '2.1.1.2.6.2.3-1'.
I have saved my model using the following command:
def fit_LR(training, testing, adl_root_path, location, modelName):
    training.cache()
    lr = LinearRegression(featuresCol=…

E B
- 1,073
- 3
- 23
- 36
0
votes
1 answer
Pyspark read data - java.util.NoSuchElementException: spark.sql.execution.pandas.respectSessionTimeZone
I have a program that is working in command line, but I'm trying to set up PyCharm to test its functionalities individually.
I must have configured something wrong, because whenever I try to read any data (whether it's a Hive query or a CSV), I get…

Laurent
- 1,914
- 2
- 11
- 25
0
votes
2 answers
How to use a window function to count day of week occurrences in Pyspark 2.1
With the below PySpark (2.1) dataset, how do you use a window function that counts the number of times the current record's day of week appeared in the last 28 days?
Example Data frame:
from pyspark.sql import functions as F
df =…
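The counting rule being asked for — for each row, how many earlier rows in the previous 28 days fall on the same weekday — can be sketched in plain Python first. (In PySpark 2.1 the same idea is typically expressed with a `Window` partitioned by the weekday with `rangeBetween` over a day-count column, but the snippet below keeps the logic runnable on its own; `same_weekday_counts` is an illustrative name.)

```python
from datetime import date, timedelta

def same_weekday_counts(dates):
    """For each date (sorted ascending), count earlier dates within the
    previous 28 days that fall on the same day of week."""
    out = []
    for i, d in enumerate(dates):
        lo = d - timedelta(days=28)
        n = sum(1 for prev in dates[:i]
                if lo <= prev < d and prev.weekday() == d.weekday())
        out.append(n)
    return out

# Three consecutive Mondays, then a Tuesday (2024-01-01 was a Monday):
ds = [date(2024, 1, 1), date(2024, 1, 8), date(2024, 1, 15), date(2024, 1, 16)]
print(same_weekday_counts(ds))  # [0, 1, 2, 0]
```

Partitioning by weekday in the Spark version makes the 28-day range window only ever see rows with the same day of week, which is exactly what this loop filters for.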

Micah Pearce
- 1,805
- 3
- 28
- 61