0

I want to submit a PySpark job whose .py files live in several different folders. In particular, I want to keep configuration files and common tools each in a single folder. The only option I know of for shipping code is the --py-files parameter, so how do I submit whole folders? My code structure looks like this:

--conf folder
|  --origin.conf
|  --scenes.conf
--tools folder
|  --utils.py
|  --vali.py
-- other folders...
Peng He

2 Answers

4
  • Create a Python package to organize the code.
  • Zip the package or build an egg file.
  • Submit your app, passing the egg or zip file via --py-files (or the pyFiles argument of SparkContext).
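The steps above could be sketched roughly as follows. This is only a minimal illustration, not an official recipe: the helper name `zip_package` is made up for this example, and it assumes the folder (for instance `tools`) contains an `__init__.py` so that it can be imported as a package from the archive.

```python
import os
import zipfile


def zip_package(package_dir, zip_path):
    """Zip a package folder, keeping the folder name as the top-level entry.

    Hypothetical helper for illustration; assumes package_dir contains an
    __init__.py and that zip_path is located outside package_dir.
    """
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                if name.endswith(".pyc"):
                    continue  # skip compiled bytecode
                full_path = os.path.join(root, name)
                # Store entries relative to the parent folder,
                # e.g. "tools/utils.py", so "import tools.utils" works.
                zf.write(full_path, os.path.relpath(full_path, parent))
    return zip_path
```

After calling `zip_package("tools", "tools.zip")`, the job could be submitted with something like `spark-submit --py-files tools.zip main.py`, where `main.py` stands in for your own entry script. Non-Python files such as `origin.conf` and `scenes.conf` are probably better shipped with the separate `--files` option of spark-submit, which takes a comma-separated list of paths.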
0

This page from Cloudera, "Running Spark Python Applications", has some examples of distributing Python packages to Spark executors.

WhyNot