I am trying to run a MapReduce job, but I cannot find my log files when the job runs. I am using Hadoop Streaming with a Python mapper, and I use Python's logging module to log messages. When I test the mapper locally on a file with cat, the log file is created:
cat file | ./mapper.py
But when I run the same job through Hadoop, I cannot find the log file. Here is the logging setup in my mapper:
import os, logging

# write log records to myApp.log (relative path, so it lands in the current working directory)
logging.basicConfig(filename="myApp.log", level=logging.INFO)
logging.info("app start")
##
## logic with log messages
##
logging.info("app complete")
But I cannot find the myApp.log file anywhere. Is the log data written somewhere, or does Hadoop ignore application logging completely? I have also searched the userlogs folder, but my log messages don't seem to be there.
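One thing I am considering, since Hadoop seems to capture each task's stderr under userlogs, is sending the log records to stderr instead of a file. A minimal sketch of what I mean (this is just my assumption about how to wire it up, not what I currently run):

import sys, logging

# send log records to stderr so they end up in the task attempt's stderr file under userlogs
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logging.info("app start")

Would that be the right approach, or should the file-based setup above also work somewhere on the cluster?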
I work with large amounts of data, and random items are not making it to the next stage. This is a big issue on our side, so I am trying to use logging to debug my application.
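Since the real goal is to find out where records drop out, I have also read that Hadoop Streaming can update counters from specially formatted lines written to stderr (the reporter:counter:<group>,<counter>,<amount> protocol). A rough sketch of how I understand it (the group and counter names are placeholders I made up):

import sys

# bump a custom Hadoop counter for every record the mapper skips
def count_skipped_record():
    sys.stderr.write("reporter:counter:MyApp,SkippedRecords,1\n")

Would counters like this be a sensible complement to regular logging for tracking the missing items?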
Any help is appreciated.