
I am trying to submit a Spark job to a Kubernetes cluster in cluster mode, from a client inside the cluster, with the `--packages` option so that the dependencies are downloaded by the driver and executors, but it is not working: the resolved packages refer to paths on the submitting client. (`kubectl proxy` is running.)

Here are the submit options:

    /usr/local/bin/spark-submit \
    --verbose \
    --master=k8s://http://127.0.0.1:8001 \
    --deploy-mode cluster \
    --class org.apache.spark.examples.SparkPi \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.namespace=spark \
    --conf spark.kubernetes.container.image=<...> \
    --conf spark.executor.instances=2 \
    --conf spark.kubernetes.pyspark.pythonVersion=3 \
    --conf spark.kubernetes.driver.secretKeyRef.AWS_ACCESS_KEY_ID=datazone-s3-secret:AWS_ACCESS_KEY_ID \
    --conf spark.kubernetes.driver.secretKeyRef.AWS_SECRET_ACCESS_KEY=datazone-s3-secret:AWS_SECRET_ACCESS_KEY \
    --packages com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.3 \
    s3.py 10
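
(For reference, `--packages` is equivalent to setting the `spark.jars.packages` property, so the same submission can also be written by replacing the `--packages` line with the conf entry below; the coordinates are unchanged:)

    --conf spark.jars.packages=com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.3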

In the logs I can see that the resolved packages refer to my local file system: the client downloads them into its local Ivy cache and sets `spark.jars` to `file://` paths that do not exist inside the driver pod.

Spark config:
    (spark.kubernetes.namespace,spark)
    (spark.jars,file:///Users/<my username>/.ivy2/jars/com.amazonaws_aws-java-sdk-1.7.4.jar,file:///Users/<my username>/.ivy2/jars/org.apache.hadoop_hadoop-aws-2.7.3.jar,file:///Users/<my username>/.ivy2/jars/joda-time_joda-time-2.10.5.jar, ....
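
A workaround that avoids client-side resolution entirely would be to bake the dependency jars into the container image. A minimal Dockerfile sketch (the base image name is a placeholder, and `/opt/spark/jars` assumes the stock Spark image layout, where that directory is already on the classpath):

    # Fetch the exact versions from Maven Central onto Spark's classpath.
    # ADD --chmod needs BuildKit; without it, files fetched from a URL end up root-owned with mode 600.
    FROM <my-spark-base-image>
    ADD --chmod=644 https://repo1.maven.org/maven2/org/apache/hadoop/hadoop-aws/2.7.3/hadoop-aws-2.7.3.jar /opt/spark/jars/
    ADD --chmod=644 https://repo1.maven.org/maven2/com/amazonaws/aws-java-sdk/1.7.4/aws-java-sdk-1.7.4.jar /opt/spark/jars/

With the jars baked in, the `--packages` option can be dropped (jars placed elsewhere in the image could be referenced with `--jars local:///path/inside/image.jar`, since the `local://` scheme points at files inside the container). But I would still prefer `--packages` to work as intended.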

Has anyone faced this problem?

  • Does this answer your question? [Spark-Submit: --packages vs --jars](https://stackoverflow.com/questions/51434808/spark-submit-packages-vs-jars) – Lamanus Mar 15 '20 at 02:17
  • no, I have just tried that, but it still refers to my client's local Maven cache. – Baris Cekic Mar 15 '20 at 22:21

0 Answers