0

test.py

import logging

# V1
logging.debug('%s before you %s', 'Look', 'leap!')

# V2
logging.debug('{} before you {}'.format('Look', 'leap!'))

I have the following code above.

Is it true when I execute with python test.py --log=INFO V2 will incur an extra CPU cycle because it will first format the string and then not output anything because of the debug level.

Is there anyway to use .format styling with logging and not incur this extra CPU cycle if the log ultimately isn't outputted?

EDIT: Part 1 -> Yes it's true via this testing code (Although very small)

old-format.py

import logging

# V1
for x in range(0, 100000):
    logging.debug('%s before you %s', 'Look', 'leap!')

new-format.py

import logging

# V2
for x in range(0, 100000):
    logging.debug('{} before you {}'.format('Look', 'leap!'))

python -m cProfile ~/Desktop/old-format.py -> 500464 function calls in 0.176 seconds

python -m cProfile ~/Desktop/new-format.py -> 600464 function calls in 0.237 seconds

ClickThisNick
  • 5,110
  • 9
  • 44
  • 69
  • https://stackoverflow.com/questions/13131400/logging-variable-data-with-new-format-string – snakecharmerb Dec 27 '18 at 21:47
  • Why do you need to use `format`? Yes, you can avoid the string interpolation if the logging level isn't relevant, if you use the correct method. – roganjosh Dec 27 '18 at 21:50
  • 1
    If you're worried about CPU cycles on this minute of a scale, you're using the wrong language man. Try C++ or Rust if that's a big concern. – the_endian Dec 27 '18 at 21:50
  • 1
    @the_endian "it's an interpreted language so I might as well do everything O(n^2)`". Deliberately taking it to the extreme, but there's no reason to be wanton with runtime. String interpolation is expensive and you could have tonnes of DEBUG messages in a tight loop. – roganjosh Dec 27 '18 at 21:53
  • @roganjosh I agree that you can squeeze as much perf out as possible and work in the same environment... Just making some suggestions if he found himself having to worry about a lot of this type of stuff. – the_endian Dec 27 '18 at 22:03
  • @roganjosh: `format` provides greater flexibility, since you can effectively define your own sub-dialect of the formatting mini-language. Aside from that, personally I prefer using `format` for everything else, and having the logging system force old style string formatting is a bit of a wart. – ShadowRanger Dec 27 '18 at 22:04
  • @ShadowRanger you already have my upvote, I didn't even consider that workaround might be possible – roganjosh Dec 27 '18 at 22:06

2 Answers2

6

Yes, it has to do extra work when you do it this way, creating strings that will never be used. That said, a small hack will allow you to switch to lazily formatted messages.

The Python folks provide a how to guide for switching though, either by using a wrapper class with a __str__ that is only invoked if the logging actually occurs, or by using a LoggerAdapter subclass wrapper for a logger that performs the necessary brace formatting lazily (so at time of use there is no extra kruft at all). The latter approach is the cleanest at time of use. The simple example from the how to guide is the following:

import logging

class Message(object):
    def __init__(self, fmt, args):
        self.fmt = fmt
        self.args = args

    def __str__(self):
        return self.fmt.format(*self.args)

class StyleAdapter(logging.LoggerAdapter):
    def __init__(self, logger, extra=None):
        super(StyleAdapter, self).__init__(logger, extra or {})

    def log(self, level, msg, *args, **kwargs):
        if self.isEnabledFor(level):
            msg, kwargs = self.process(msg, kwargs)
            self.logger._log(level, Message(msg, args), (), **kwargs)

logger = StyleAdapter(logging.getLogger(__name__))

def main():
    logger.debug('Hello, {}', 'world!')

if __name__ == '__main__':
    logging.basicConfig(level=logging.DEBUG)
    main()

In actual code, Message and StyleAdapter would likely be hidden in a separate module, leaving very little custom code in modules that use them.

ShadowRanger
  • 143,180
  • 12
  • 188
  • 271
3

Seriously, you're worrying about the wrong problem. Yes, the string will be formatted and maybe ultimately discarded. No, it doesn't matter.

How do I know?

  • Have you measured it? I bet you haven't and if you did you'd find the time to format the string is tiny, orders of magnitudes less than the time to write to the disk, and close to the time it takes to call the function.

  • If the logging is in a tight loop, called thousands of times, your log will be ugly and hard to use. So I bet it's not there, or won't be when you're done debugging.

  • If the logging isn't in a tight loop, the total formatting time for lost messages will be miniscule.

  • If the program is CPU-intensive, and spends much time in the Python interpreter, logging isn't why. You'll need to look for Python techniques/libraries that relieve the interpreter of work, or a language that does.

  • If misuse of the log module were a significant concern to many programmers, there would be warnings in the documentation or fixes in the code. You have too much company to have independently discovered a trivial optimization problem.

  • I have seen C programs that use heavy-duty debugging logs, in which printf contributes significant overhead. But that's because the rest of C is so lean. If C had garbage collection, dynamic name resolution, and list comprehension, etc., formatted I/O would start to look cheap.

The most efficient way to use the log module is the way that's easiest for you to program, because your time (I hope!) isn't cheap. If someone complains about your "wasted cycles", ask him to show how it accounts for even 1% of the running time of your program. Then you can return your focus to something that matters.

James K. Lowden
  • 7,574
  • 1
  • 16
  • 31
  • 1
    I don't buy this argument at all. I use loops for heuristics, maybe running 300K times till the problem is solved. I _may_ want to debug all the swaps I do in my scheduling problem. The loop is in python, the transformations are in numpy. You could easily double my runtime for DEBUG messages. I'm mostly on your side but I just don't like sweeping assertions when I know that python libraries are fine for my application, but I won't go and deliberately slow everything down. – roganjosh Dec 27 '18 at 22:23
  • @roganjosh, Yes, log messages can matter to performance. No, how you format them doesn't. In your 300K loop, are you producing debug statements, or not? If you are, the I/O is more expensive than the string formatting, and the formatting is unavoidable, right? If you're not, and you're still calling the log function, then how much does the string formatting cost as a % of the function call? I'm saying: not much, so doesn't matter. – James K. Lowden Jan 01 '19 at 22:08