
I can see all the Hadoop logs under my /usr/local/hadoop/logs path, but where can I see application-level logs? For example:

mapper.py

import logging

def main():
    logging.info("starting map task now")
    # -- do some task -- #
    print("key\tvalue")  # placeholder: the mapper's key/value output goes to stdout

if __name__ == '__main__':
    main()

reducer.py

import sys
import logging

def main():
    for line in sys.stdin:
        logging.info("received input to reducer - " + line)
        # -- do some task -- #
        print(line.strip())  # placeholder: the reducer's output goes to stdout

if __name__ == '__main__':
    main()

Where can I see the logging.info or related log statements of my application? I am using Python with Hadoop Streaming.

Thank you

daydreamer

2 Answers


Hadoop captures everything a task writes to stderr, and you can view it on the Hadoop map/reduce job status web UI. So you can simply write your log messages to stderr.
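
For example, a minimal sketch of a streaming mapper that writes diagnostics to stderr while keeping its key/value output on stdout (the messages and output format here are placeholders):

#!/usr/bin/env python
import sys

def main():
    for line in sys.stdin:
        # diagnostics go to stderr; Hadoop collects them per task
        sys.stderr.write("mapper processing: %s\n" % line.strip())
        # key/value output still goes to stdout
        print(line.strip())

if __name__ == '__main__':
    main()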

Sergey Zyuzin
  • just write to stderr: `import sys; print >> sys.stderr, 'spam'` or other alternatives at http://stackoverflow.com/questions/5574702/how-to-print-to-stderr-in-python – Nickolay Jun 03 '15 at 13:11

Hadoop Streaming uses STDIN/STDOUT to pass the key/value pairs between the mappers and reducers, so the log messages have to be written to a dedicated log file instead - check the sample code and the Python logging documentation for more details. This query might also help.
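
For instance, a minimal sketch that directs log records to a file via logging.basicConfig (the path /tmp/mapper.log is only an assumption; use any location writable on the task node):

import logging

# assumption: /tmp/mapper.log is writable on the task node; adjust as needed
logging.basicConfig(filename='/tmp/mapper.log', level=logging.INFO)
logging.info("starting map task now")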

Praveen Sripati
    Thank you Praveen, I added the logging.warn statements and it started to accumulate logs in /usr/local/hadoop/logs/userlogs/dir/stderr file – daydreamer Oct 26 '11 at 05:04
  • for the record (five years later...), the default logging level is `logging.WARNING`; to emit info-level statements in the code above, the root (default) logger's level must be changed: `logging.getLogger().setLevel(logging.INFO)` – slushy Dec 23 '16 at 15:25
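
Putting the two comments above together, a minimal sketch (assuming you want info-level records to end up in the task's stderr log under userlogs) would be:

import logging
import sys

# send log records to stderr, which Hadoop collects under userlogs,
# and lower the threshold from the default WARNING to INFO
logging.basicConfig(stream=sys.stderr, level=logging.INFO)
logging.info("this now appears in the task's stderr log")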