Questions tagged [spark-redis]
26 questions
3
votes
1 answer
Spark/Scala parallel write to Redis
Is it possible to write to Redis in parallel from Spark?
(Or: how to write tens of thousands of keys/lists quickly from Spark)
Currently, I'm writing to Redis by key in sequence, and it's taking forever. I need to write about 90000 lists (of length…

BBischof
2
votes
2 answers
How to set up JAR configs in Databricks for Redis connections
I have installed the following JAR in Databricks: "com.redislabs:spark-redis_2.12:2.5.0". I am trying to create a Spark session with the respective authentication settings.
Below is the code where I create a Spark session with credentials:
redis=…

Vamsi Nimmala
2
votes
0 answers
java.net.SocketException: Connection reset exception when trying to load/write data into Redis from a Spark SQL Java application
I am trying to load data from and write data to a Redis cache in my Java-based Spark SQL application.
Here is my code:
SparkSession sprk = SparkSession.builder().appName("Bulk processing").master("local").config("spark.redis.host", "127.0.0.1")
…

Siva
2
votes
1 answer
Read a specific key from Redis using PySpark
I am trying to read a specific key from Redis using PySpark.
In the documentation, I haven't found any specific command to read a particular key. Using the code below, I can read all data from Redis:
testid =…

Abhishek Vij
2
votes
1 answer
How to add a JAR to Spark in PyCharm
I want to debug Spark code in PyCharm because it is easier to debug there. But I need to add spark-redis.jar; otherwise I get the error "Failed to find data source: redis".
The code to connect to Redis is:
spark = SparkSession \
.builder \
…

Litchy
2
votes
0 answers
spark-redis connector for Scala 2.11
I'm looking for a solution where I can save my Kafka streaming RDD to Redis with zscore and in append mode. Is there a connector that does this? I tried the spark-redis connector by RedisLabs, but it is only compatible with Scala 2.10. There is one more…

Pinnacle
2
votes
0 answers
PySpark / Redis - Data aggregation over time
I am streaming a lot of data (200k+ events per 3-second batch) from Kafka using the PySpark KafkaUtils implementation.
I receive live data with:
a sessionID
an IP
a state
What I am doing for now with a basic Spark/Redis implementation is the…

Orelus
1
vote
1 answer
Adding a JAR driver to EMR 6.7.0 Spark
I'm trying to connect to an AWS Redis cluster from an EMR cluster. I uploaded the JAR driver to S3 and used this bootstrap action to copy the JAR file to the cluster nodes:
aws s3 cp s3://sparkbcuket/spark-redis-2.3.0.jar…

billie class
1
vote
0 answers
Why is the performance of Redis worse than Hive?
I'm using Hadoop to work on a big data project.
I can use Spark to send SQL commands to Hive.
Since this process is slow, I tried writing my data to Redis, which is an open-source database, and using Spark to query my data from this database to…

Musicmath
1
vote
1 answer
How to install JARs related to spark-redis in a Databricks cluster?
I am trying to connect to Azure Cache for Redis from Databricks.
I have installed the package com.redislabs:spark-redis:2.3.0 from Maven in Databricks. I have created a Spark session with the below…

Sai
1
vote
1 answer
spark-redis exception: Caused by: redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
I am trying to insert data into Redis (Azure Cache for Redis) through Spark.
There are around 700 million rows, and I am using the spark-redis connector to insert the data. It fails after some time, throwing this error. I am able to insert some rows but after…

Deep
1
vote
1 answer
Read data saved by spark-redis using Java
I am using spark-redis to save a Dataset to Redis.
Then I read this data using Spring Data Redis.
This is the object I save to Redis:
@Getter
@Setter
@AllArgsConstructor
@NoArgsConstructor
@Builder
@RedisHash("collaborative_filtering")
public class…

Bằng Ngô Duy
1
vote
0 answers
Performance issue while converting a PySpark DataFrame to JSON
I would like to insert the content of a PySpark DataFrame into Redis efficiently. I have tried a couple of methods, but none of them gives the expected results.
Converting the DataFrame to JSON takes 30 seconds. The goal is to SET the JSON payload into a Redis cluster for…

user2407164
1
vote
1 answer
spark-redis: DataFrame write times too slow
I am an Apache Spark/Redis user, and I recently tried spark-redis for a project. The program generates PySpark DataFrames with approximately 3 million lines, which I am writing to a Redis database using the command
df.write \
…

holypriest
1
vote
0 answers
Kafka Spark Stream: save directly to Redis
I'm using Scala to get a kafkaStream and want to insert this data directly into Redis. What is the best strategy to do so?
val kafkaStream = KafkaUtils.createStream(ssc, "192.168.0.40:2181", "group", topics,…

Pinnacle