
I've been struggling with multiprocessing logging for some time, for many reasons.

One of my reasons is: why another get_logger?

Of course I've seen this question, and it seems the logger that multiprocessing.get_logger returns does some "process-shared locks" magic to make log handling smooth.

So, today I looked into the multiprocessing code of Python 2.7 (/multiprocessing/util.py) and found that this logger is just a plain logging.Logger; there's barely any magic around it.

Here's the description in the Python documentation, right before the get_logger function:

Some support for logging is available. Note, however, that the logging package does not use process shared locks so it is possible (depending on the handler type) for messages from different processes to get mixed up.

So with the wrong logging handler, even the get_logger logger can go wrong? I've been using a program that logs via get_logger for some time. It prints logs to a StreamHandler and (it seems) the output never gets mixed up.

Now my theory is:

  1. multiprocessing.get_logger doesn't do process-shared locks at all
  2. StreamHandler works for multiprocessing, but FileHandler doesn't (a small experiment to test this is sketched below)
  3. the major purpose of this get_logger logger is tracking processes' life-cycles, and providing an easy-to-get, ready-to-use logger that already logs the process's name/ID and the like
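
For concreteness, here's a minimal experiment of my own (not from the docs or the multiprocessing source) to test point 2: several workers log long lines through multiprocessing.get_logger, and you can swap the StreamHandler for a FileHandler to compare. It assumes the fork start method, where child processes inherit the parent's handler configuration:

import logging
import multiprocessing

def worker():
    # In a forked child this returns the same logger configured in the parent.
    logger = multiprocessing.get_logger()
    for i in range(1000):
        logger.info('a long message to raise the odds of interleaving, number %d', i)

if __name__ == '__main__':
    logger = multiprocessing.get_logger()
    handler = logging.StreamHandler()  # swap for logging.FileHandler('test.log') to compare
    handler.setFormatter(logging.Formatter('[%(processName)s] %(message)s'))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)

    procs = [multiprocessing.Process(target=worker) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()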

Here's the question:

Is my theory right?

How/Why/When do you use this get_logger?

tdihp

2 Answers


Yes, I believe you're right that multiprocessing.get_logger() doesn't do process-shared locks - as you say, the docs even state this. Despite all the upvotes, it looks like the question you link to is flawed in stating that it does (to give it the benefit of the doubt, it was written over a decade ago, so perhaps that was the case at one point).

Why does multiprocessing.get_logger() exist then? The docs say that it:

Returns the logger used by multiprocessing. If necessary, a new one will be created.

When first created the logger has level logging.NOTSET and no default handler. Messages sent to this logger will not by default propagate to the root logger.

i.e. by default the multiprocessing module will not produce any log output: its logger starts at level logging.NOTSET with no handler attached and no propagation to the root logger.
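
You can check that default state for yourself in a fresh interpreter (a quick sketch; the attributes are the standard logging.Logger ones):

import logging
import multiprocessing

logger = multiprocessing.get_logger()
print(logger.level == logging.NOTSET)  # True: no level configured yet
print(logger.handlers)                 # []: no handler attached by default
print(bool(logger.propagate))          # False: records don't reach the root logger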

If you were to have a problem with your code that you suspected to be an issue with multiprocessing, that lack of log output wouldn't be helpful for debugging. That's what multiprocessing.get_logger() exists for: it returns the logger used by the multiprocessing module itself, so that you can override the default logging configuration and see what the module is doing.

Since you asked how to use multiprocessing.get_logger(): you'd call it and then configure the logger in the usual fashion, for example:

import logging
import multiprocessing

# Grab multiprocessing's own module logger and configure it like any other.
logger = multiprocessing.get_logger()
formatter = logging.Formatter('[%(levelname)s/%(processName)s] %(message)s')
handler = logging.StreamHandler()
handler.setFormatter(formatter)
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# now run your multiprocessing code

That said, you may actually want to use multiprocessing.log_to_stderr() instead for convenience - as per the docs:

This function performs a call to get_logger() but in addition to returning the logger created by get_logger, it adds a handler which sends output to sys.stderr using format '[%(levelname)s/%(processName)s] %(message)s'

i.e. it saves you needing to set up quite so much logging config yourself, and you can instead start debugging your multiprocessing issue with just:

import logging
import multiprocessing

logger = multiprocessing.log_to_stderr()
logger.setLevel(logging.INFO)

# now run your multiprocessing code

To reiterate though, that's just a normal module logger that's being configured and used, i.e. there's nothing special or process-safe about it. It just lets you see what's happening inside the multiprocessing module itself.
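
You can also confirm for yourself that it's an ordinary logger (a quick sketch; assumes CPython, where multiprocessing/util.py registers its logger under the name 'multiprocessing', and the default logger class):

import logging
import multiprocessing

logger = multiprocessing.get_logger()
print(type(logger))                                    # <class 'logging.Logger'>
print(logger is logging.getLogger('multiprocessing'))  # True: the same plain logger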

emmagordon
  • So also add in your answer that multiprocessing.get_logger() is not usable for logging across multiple processes! Lots of developers think this method will give them a process-safe logging facility that can be used with a FileHandler to log concurrently across processes, so they expect something like process-shared locks. – AmirHmZ May 29 '20 at 23:13
  • And clarify that multiprocessing.get_logger() is a normal logger just like the others! Today many modules provide a configurable logging facility so you can see exactly what is happening inside the module, and the logger that multiprocessing.get_logger() returns is just like them. – AmirHmZ May 29 '20 at 23:20
  • Thanks - I've added a concluding paragraph to summarise that to make sure that's clear to readers. – emmagordon Jun 01 '20 at 21:08

This answer is not about get_logger specifically, but perhaps you can use the approach suggested in this post? Note that the QueueHandler/QueueListener classes are available for earlier Python versions via the logutils package (available on PyPI, too).
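
For readers who can't follow the link, here's a minimal sketch of the general queue-based pattern (my own illustration, not the linked post's exact code): each worker sends log records through a QueueHandler, and a single QueueListener in the main process does all the file I/O, so even a FileHandler is safe because only one process ever writes to the file. On Python 3.2+ these classes live in logging.handlers; on earlier versions the logutils package provides them:

import logging
import logging.handlers
import multiprocessing

def worker(queue):
    # Workers never touch the file; they just put records on the shared queue.
    logger = logging.getLogger('worker')
    logger.addHandler(logging.handlers.QueueHandler(queue))
    logger.setLevel(logging.INFO)
    logger.info('hello from %s', multiprocessing.current_process().name)

if __name__ == '__main__':
    queue = multiprocessing.Queue()
    # One listener in the main process performs the actual I/O.
    file_handler = logging.FileHandler('app.log')
    file_handler.setFormatter(logging.Formatter('[%(processName)s] %(message)s'))
    listener = logging.handlers.QueueListener(queue, file_handler)
    listener.start()

    procs = [multiprocessing.Process(target=worker, args=(queue,)) for _ in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    listener.stop()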

Vinay Sajip
  • Sorry, I can't visit blogspot from where I am, but I know your post is about multiprocessing-friendly logging, thanks. About that, I actually intend to use zmq.log for my project, which uses zmq. :-) – tdihp Nov 23 '12 at 14:02