
I'm trying to run a Spark Streaming job on the DC/OS platform and I've run into an issue with the Kafka packages. When I try to include the Kafka library and its dependencies (a jar file downloaded from Maven, added to our Artifactory and read from there) using the --jars option, as follows:

dcos spark run --submit-args="--jars https://../../../spark-streaming_2.11-2.2.1.jar --conf spark.executor.memory=2g --py-files=https://../../../libs.zip,https://../../../test.py etc"

it seems that the libs.zip and test.py files are read correctly, but the .jar file is ignored.

Any idea why? Is there any workaround for this kind of issue?

Thanks in advance for any help!

Ali AzG

1 Answer


I'm not sure why the dcos spark run command doesn't honour the --jars option, but you can use the spark.mesos.uris property to download artifacts into the working directory of the Spark driver and executors.
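For example, something along these lines might work (just a sketch; the URL is the placeholder from your own command, and the remaining arguments are left as you had them):

dcos spark run --submit-args="--conf spark.mesos.uris=https://../../../spark-streaming_2.11-2.2.1.jar --conf spark.executor.memory=2g --py-files=https://../../../libs.zip,https://../../../test.py etc"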

I'm not sure how your Python-based Spark job is going to use the JARs, but you may also need to set the spark.executor.extraClassPath and spark.driver.extraClassPath configuration properties.
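If that turns out to be necessary, you could extend the same submit-args with something like the following (assuming, and this is only my assumption, that the Mesos fetcher drops the jar into the sandbox working directory under its original file name):

--conf spark.driver.extraClassPath=spark-streaming_2.11-2.2.1.jar --conf spark.executor.extraClassPath=spark-streaming_2.11-2.2.1.jar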

Andrey Dyatlov