I am using Python to implement Spark jobs. We wanted to get the Python logging output from the application into the Spark history server, so we used the method outlined here:
PySpark logging from the executor
However, the problem is that since the yarn_logger initialization only happens in the driver, the executors still run with a Python logging level of WARNING, which means no logs from the executors show up.
In my driver I do the following:
import yarn_logger

if __name__ == '__main__':
    # initialize logging in main
    yarn_logger.YarnLogger.setup_logger()
And in the other Python files, I just initialize the Python logging module:
import logging
LOG = logging.getLogger(__name__)
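For instance, a typical executor-side function in one of those modules looks roughly like this (the function name and the `rdd.map` call are made up for illustration; it uses the module-level `LOG` from above):

def parse_record(line):
    # Runs on an executor; since the executor's Python process is still at
    # the default WARNING level, this INFO message never shows up.
    LOG.info("parsing record: %s", line)
    return line.split(',')

# called from the driver, e.g.: rdd.map(parse_record).collect()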
But this results in only the logs emitted in the driver context showing up.
How do I architect this so that yarn_logger is initialized only once per process, whether the application is running in local mode or cluster mode? I could, of course, initialize yarn_logger in each Python module of my application, but that might cause it to be initialized multiple times per process if I run the application in local mode.
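The only workaround I can think of is to wrap the setup in an idempotent helper and call it from every module or function that might run on an executor, roughly like this (a sketch only; `ensure_yarn_logging` and the module-level flag are names I made up):

import yarn_logger

_YARN_LOGGING_DONE = False  # hypothetical module-level flag

def ensure_yarn_logging():
    # Safe to call from any module, driver or executor: the setup
    # runs at most once per Python process.
    global _YARN_LOGGING_DONE
    if not _YARN_LOGGING_DONE:
        yarn_logger.YarnLogger.setup_logger()
        _YARN_LOGGING_DONE = True

That would guard against double initialization in local mode, but having to call it everywhere feels clunky, which is why I'm asking whether there is a cleaner pattern.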