In my notebook, I have set up a logging utility so that I can debug DSX scheduled notebooks:
# utility method for logging
log4jLogger = sc._jvm.org.apache.log4j
LOGGER = log4jLogger.LogManager.getLogger("CloudantRecommender")

def info(*args):
    # sends output to the notebook
    print(args)
    # sends output to the kernel log file
    LOGGER.info(args)
Using it like so:
info("some log output")
If I check the log files, I can see my log output is getting written:
! grep 'CloudantRecommender' $HOME/logs/notebook/*pyspark*
kernel-pyspark-20170105_164844.log:17/01/05 10:49:08 INFO CloudantRecommender: [Starting load from Cloudant: , 2017-01-05 10:49:08]
kernel-pyspark-20170105_164844.log:17/01/05 10:53:21 INFO CloudantRecommender: [Finished load from Cloudant: , 2017-01-05 10:53:21]
However, when the notebook runs as a scheduled job, log output doesn't seem to make it into the kernel-pyspark-*.log file.
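One thing I can check is whether the scheduled run writes to a different file altogether, by listing the kernel logs by modification time after the job fires (plain shell, nothing DSX-specific):

! ls -lrt $HOME/logs/notebook/

If the scheduled job uses its own kernel, I'd expect a new kernel-pyspark-*.log to appear there.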
How can I write log output in DSX scheduled notebooks for debugging purposes?
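As a fallback while I investigate, I could bypass log4j entirely and write to a file I control with Python's standard logging module. A minimal sketch, assuming the scheduled job can write under $HOME (the path and format below are my own choices, nothing DSX-specific):

import logging
import os

# log to a file of our own, independent of the Spark kernel logs
LOG_PATH = os.path.join(os.environ["HOME"], "logs", "CloudantRecommender.log")
if not os.path.isdir(os.path.dirname(LOG_PATH)):
    os.makedirs(os.path.dirname(LOG_PATH))

logging.basicConfig(
    filename=LOG_PATH,
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
LOGGER = logging.getLogger("CloudantRecommender")

def info(*args):
    message = " ".join(str(a) for a in args)
    print(message)        # notebook output when run interactively
    LOGGER.info(message)  # our own log file, for scheduled runs

But I'd still like to understand where (or whether) kernel log output ends up for scheduled jobs.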