I'm trying to connect to an AWS Redis (ElastiCache) cluster from an EMR cluster. I uploaded the driver JAR to S3 and used this bootstrap action to copy it to the cluster nodes:
aws s3 cp s3://sparkbucket/spark-redis-2.3.0.jar /home/hadoop/spark-redis-2.3.0.jar
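For context, the copy command above lives in a one-line bootstrap script in the same bucket (the filename copy-redis-jar.sh is just what I called it):

#!/bin/bash
# copy-redis-jar.sh -- runs on every node while the cluster is provisioning
set -euo pipefail
aws s3 cp s3://sparkbucket/spark-redis-2.3.0.jar /home/hadoop/spark-redis-2.3.0.jar

and I register it when creating the cluster, roughly like this (other options elided):

aws emr create-cluster ... --bootstrap-actions Path=s3://sparkbucket/copy-redis-jar.sh,Name=copy-redis-jar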
This is my connection-test Spark app:
from pyspark.sql import SparkSession

if __name__ == "__main__":
    # Point the session at the ElastiCache endpoint.
    spark = SparkSession.builder \
        .config("spark.redis.host", "testredis-0013.vb4vgr.00341.eu1.cache.amazonaws.com") \
        .config("spark.redis.port", "6379") \
        .appName("Redis_test") \
        .getOrCreate()

    # Read all keys through the spark-redis data source.
    df = spark.read.format("org.apache.spark.sql.redis") \
        .option("key.column", "key") \
        .option("keys.pattern", "*") \
        .load()

    # Dump the result to S3 as CSV.
    df.write.csv(path="s3://sparkbucket/", sep=",")
    spark.stop()
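As a sanity check I'm also considering a round-trip variant that writes a small test DataFrame to Redis and reads it back, instead of relying on keys already being there (sketch only; the table name "test" and the columns are arbitrary, and it would go inside the same __main__ block before spark.stop()):

    # Write a tiny DataFrame through spark-redis, then read it back.
    test_df = spark.createDataFrame([("a", 1), ("b", 2)], ["key", "value"])
    test_df.write.format("org.apache.spark.sql.redis") \
        .option("table", "test") \
        .option("key.column", "key") \
        .mode("overwrite") \
        .save()

    # If connectivity works, this should print the two rows written above.
    readback = spark.read.format("org.apache.spark.sql.redis") \
        .option("table", "test") \
        .option("key.column", "key") \
        .load()
    readback.show()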
When running the application with this spark-submit command:
spark-submit --deploy-mode cluster --driver-class-path /home/hadoop/spark-redis-2.3.0.jar s3://sparkbucket/testredis.py
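I've also seen examples that pass the jar with --jars instead, which ships it to the executors as well as the driver; I haven't confirmed whether that is what's missing here, but that variant would look like:

spark-submit --deploy-mode cluster --jars /home/hadoop/spark-redis-2.3.0.jar s3://sparkbucket/testredis.py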
I get the following error and I'm not sure what I did wrong:
ERROR Client: Application diagnostics message: User application exited with status 1
Exception in thread "main" org.apache.spark.SparkException: Application application_1658168513779_0001 finished with failed status