I'm building an Apache Spark Streaming application and cannot make it log to a file on the local filesystem when running it on YARN. How can I achieve this?
I've set up my log4j.properties file so that it can successfully write to a log file in the /tmp directory on the local file system (shown below partially):
log4j.appender.file=org.apache.log4j.FileAppender
log4j.appender.file.File=/tmp/application.log
log4j.appender.file.append=false
log4j.appender.file.layout=org.apache.log4j.PatternLayout
log4j.appender.file.layout.ConversionPattern=%d{yyyy-MM-dd HH:mm:ss} %-5p %c{1}:%L - %m%n
When I run my Spark application locally by using the following command:
spark-submit --class myModule.myClass --master local[2] --deploy-mode client myApp.jar
It runs fine and I can see that log messages are written to /tmp/application.log on my local file system.
But when I run the same application via YARN, e.g.
spark-submit --class myModule.myClass --master yarn-client --name "myModule" --total-executor-cores 1 --executor-memory 1g myApp.jar
or
spark-submit --class myModule.myClass --master yarn-cluster --name "myModule" --total-executor-cores 1 --executor-memory 1g myApp.jar
I cannot see any /tmp/application.log on the local file system of the machine that runs YARN.
What am I missing?