I am running a Spark application in 'local' mode. It checkpoints correctly to the directory defined in the checkpointFolder config. However, I am seeing two issues that are causing disk space problems.
1) Multiple users run the application, and the checkpoint folder on the server is created by whichever user executes it first. Other users' runs then fail with a permissions error at the OS level. Is there a way to give checkpointFolder a path relative to each user's home directory, for example checkpointFolder=~/spark/checkpoint?
2) I have set spark.worker.cleanup.enabled=true to clean up the checkpoint folder after the run, but I don't see that happening. Is there an alternative way to clean it up from within the app, instead of resorting to a cron job?