
I need to know the current working directory URI/URL of a Spark executor so I can copy some dependencies there before the job executes. How do I get it in Java? What API should I call?

YuGagarin
  • Spark executors are not long-lived processes, and you can't control where they run in the cluster – OneCricketeer Oct 02 '17 at 22:24
  • @cricket_007 If YARN knows where to put archives for spark-submit, then it should be possible to do the same from code in the main jar – YuGagarin Oct 02 '17 at 22:30
  • Right, that's what `SparkFiles` is for, as answered. But your definition of "in code" probably means the driver process, not the executors – OneCricketeer Oct 02 '17 at 23:40
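
To make the driver-vs-executor distinction in this exchange concrete, here is a minimal Java sketch: the driver ships a file with `SparkContext.addFile`, and each executor task resolves its own local copy with `SparkFiles.get`. The path `hdfs:///deps/lookup.txt` and the class name are placeholders, not from the original question.

```java
import java.util.Arrays;

import org.apache.spark.SparkFiles;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

public class DriverVsExecutor {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("driver-vs-executor")
                .getOrCreate();
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

        // Driver side: ship the dependency to every executor.
        // "hdfs:///deps/lookup.txt" is a placeholder path.
        jsc.addFile("hdfs:///deps/lookup.txt");

        // Executor side: each task resolves the file relative to the
        // working directory of whichever executor it happens to run on.
        jsc.parallelize(Arrays.asList(1, 2, 3))
           .map(x -> SparkFiles.get("lookup.txt"))
           .collect()
           .forEach(System.out::println);

        spark.stop();
    }
}
```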

1 Answer


The working directory is application-specific, so you won't be able to get it before the application starts. It is best to use the standard Spark mechanisms (a minimal Java sketch follows the list):

  • --jars / spark.jars - for JAR files.
  • --py-files / spark.submit.pyFiles - for Python dependencies.
  • SparkFiles / --files / --archives - for everything else.
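
For the "what API should I call in Java" part of the question, here is a minimal sketch of the `SparkFiles` route. The file and JAR names are placeholders; it assumes the job was submitted with `--files` so that Spark stages the file into each JVM's working directory.

```java
// Assumed submit command (placeholder paths):
//   spark-submit --class App \
//     --jars /local/dep.jar \
//     --files /local/config.properties \
//     app.jar
import org.apache.spark.SparkFiles;
import org.apache.spark.sql.SparkSession;

public class App {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("spark-files-example")
                .getOrCreate();

        // SparkFiles.get() returns the absolute local path of a file shipped
        // with --files (or SparkContext.addFile), resolved against this JVM's
        // working directory. It works on the driver here, and inside tasks
        // on executors.
        String configPath = SparkFiles.get("config.properties");
        System.out.println("config.properties resolved to: " + configPath);

        spark.stop();
    }
}
```

This is also why you don't need the executor's working directory up front: Spark stages the files there for you, and `SparkFiles.get` abstracts away the actual location.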
OneCricketeer
  • --archives does not always work. At least not on the Azure HDInsight cluster I am using, so I have to resort to a programmatic approach until Microsoft fixes it or documents it properly... – YuGagarin Oct 02 '17 at 22:17