I have a working HDFS setup and a running Spark cluster on a remote server. I am running SparkR applications and would like to view the event logs of completed applications in the History Server UI as well. I followed all the instructions here: Windows: Apache Spark History Server Config and was able to start the History Server on the server.
However, event logging only works when the super-user (the account that started the Hadoop NameNode and the Spark processes) submits a Spark application remotely: in that case the event logs are written to the HDFS path and the application shows up in the Spark History Web UI.
When I run the same application remotely from my own user ID, the History Server still shows as up and running on port 18080, but none of my applications get logged.
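For completeness, this is roughly how I submit the application from my user ID (the master URL and script name below are placeholders, not the actual values):

spark-submit --master spark://XX.XX.XX.XX:7077 my_sparkr_job.R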
I have been given read, write, and execute access to the event log folder in HDFS.
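A minimal sketch of the permission checks I know how to run, assuming the log directory is /user/logs as in the config below (the chmod line is only my guess at what a shared event log directory might need, not something I have confirmed):

# verify ownership and permission bits on the event log directory
hdfs dfs -ls /user

# my assumption: world-writable with the sticky bit so each user can write
# their own event log files but not delete anyone else's
hdfs dfs -chmod 1777 /user/logs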
The spark-defaults.conf file now looks like this:
spark.eventLog.enabled true
spark.history.fs.logDirectory hdfs://XX.XX.XX.XX:19000/user/logs
spark.eventLog.dir hdfs://XX.XX.XX.XX:19000/user/logs
spark.history.ui.acls.enable false
spark.history.fs.cleaner.enabled true
spark.history.fs.cleaner.interval 1d
spark.history.fs.cleaner.maxAge 7d
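As a basic sanity check, I can list the event log directory to see which runs actually produced files and who owns them (runs by the super-user appear there; runs from my user ID apparently do not):

# list event log files and their owners
hdfs dfs -ls hdfs://XX.XX.XX.XX:19000/user/logs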
Am I missing some permissions or config settings somewhere (Spark? HDFS?)?
Any pointers/tips to proceed from here would be appreciated.