21

Is it advantageous to call logging functions with format string + args list vs. formatting inline?

I've seen (and written) logging code that uses inline string formatting:

logging.warn("%s %s %s" % (arg1, arg2, arg3))

and yet I assume it's better (performance-wise, and more idiomatic) to use:

logging.warn("%s %s %s", arg1, arg2, arg3)

because the second form avoids string formatting operations prior to invoking the logging function. If the current logging level would filter out the log message, no formatting is necessary, reducing computing time and memory allocations.

Am I on the right track here, or have I missed something?

Inactivist
  • 9,997
  • 6
  • 29
  • 41
  • 2
    I think you're on the right track. – mgilson Aug 14 '12 at 15:32
  • Do you have a reference about the timing of the string formatting taking place at different times? I would have expected both take place before the call to `warn` is made. Not disputing what you are saying, just curious to find out more. – Levon Aug 14 '12 at 15:42
  • Well, in the first form, we're doing string formatting (*and* tuple creation) *prior* to the call to `logging.warn` - in the second case, we're just passing a list of args to `logging.warn` (which is merely tuple creation?) - so we avoid a string formatting operation at the invocation. As I mentioned in the question, I assume that if the current logging level would filter out the log message, no formatting would be done, again avoiding the string formatting operation. My conjecture is this *should* save processing time and memory allocations. – Inactivist Aug 14 '12 at 15:48
  • @Inactivist Ah .. ok, thanks for the additional information. I guess you can draw your own conclusions from the timings I provided with the information you already have. I guess another approach might be to perhap time the actual calls and see how they compare? – Levon Aug 14 '12 at 15:50
  • @Levon I'm working on testing actual logging invocations via timeit. I'll report back when I have useful results. – Inactivist Aug 14 '12 at 15:51
  • @Inactivist great .. it will be interesting to know. – Levon Aug 14 '12 at 15:52
  • 3
    possible duplicate of [Lazy logger message string evaluation](http://stackoverflow.com/questions/4148790/lazy-logger-message-string-evaluation) – Steven Rumbalski Aug 14 '12 at 16:17
  • @Levon: It does matter. See my answer, below, or user1202136's answer. – Inactivist Aug 14 '12 at 19:46
  • @Inactivist I did see both .. so this was your question, but I learned something new too - thanks – Levon Aug 14 '12 at 20:39

3 Answers3

23

IMHO, for messages that are very likely to be displayed, such as those given to error or warn it does not make much of a difference.

For messages that are less likely displayed, I would definitely go for the second version, mainly for performance reasons. I often give large objects as a parameter to info, which implement a costly __str__ method. Clearly, sending this pre-formatted to info would be a performance waste.

UPDATE

I just checked the source code of the logging module and, indeed, formatting is done after checking the log level. For example:

class Logger(Filterer):
    # snip
    def debug(self, msg, *args, **kwargs):
        # snip
        if self.isenabledfor(debug):
            self._log(debug, msg, args, **kwargs)

One can observe that msg and args are untouched between calling log and checking the log level.

UPDATE 2

Spired by Levon, let me add some tests for objects that have a costly __str__ method:

$ python -m timeit -n 1000000 -s "import logging" -s "logger = logging.getLogger('foo')" -s "logger.setLevel(logging.ERROR)" "logger.warn('%s', range(0,100))"
1000000 loops, best of 3: 1.52 usec per loop
$ python -m timeit -n 1000000 -s "import logging" -s "logger = logging.getLogger('foo')" -s "logger.setLevel(logging.ERROR)" "logger.warn('%s' % range(0,100))"
1000000 loops, best of 3: 10.4 usec per loop

In practice, this could give a fairly high performance boost.

user1202136
  • 11,171
  • 4
  • 41
  • 62
  • 1
    Another time this is likely to make a difference is if the logging is in a performance critical loop. – Steven Rumbalski Aug 14 '12 at 16:18
  • 4
    That's what I expected based on observation. Thanks for going source diving! – Inactivist Aug 14 '12 at 16:19
  • 1
    Python 3.6 f-string formatting shows the same time penalty: `python -m timeit -n 1000000 -s "import logging" -s "logger = logging.getLogger('foo')" -s "logger.setLevel(logging.ERROR)" -s "x=list(range(0,100))" "logger.warn(f'{x}')"` yields 10 usec per loop, same as with classic style string formatting. Passing the formatting parameters directly to the log statement yields 2.12 usec. – Dustin Wyatt Apr 23 '17 at 16:30
9

In case this is helpful, here is a quick timing test for just the two formatting options:

In [61]: arg1='hello'
In [62]: arg2='this'
In [63]: arg3='is a test'

In [70]: timeit -n 10000000 "%s %s %s" % (arg1, arg2, arg3)
10000000 loops, best of 3: 284 ns per loop

In [71]: timeit -n 10000000  "%s %s %s", arg1, arg2, arg3
10000000 loops, best of 3: 119 ns per loop

seems to give the second approach the edge.

Levon
  • 138,105
  • 33
  • 200
  • 191
  • And what's about subsequent call of `%` withing `log` function? – Vladimir Aug 14 '12 at 15:35
  • So, the most important lesson I can learn from your answer: timeit is my friend! :D – Inactivist Aug 14 '12 at 15:36
  • @Inactivist I've found it to be useful .. often to verify (or contradict) my assumptions about what's speedy and what's not :) – Levon Aug 14 '12 at 15:37
  • Is there a similar *simple* way to determine differences in memory consumption between the two variants? – Inactivist Aug 14 '12 at 15:38
  • @Inactivist I've wondered about this too, and unfortunately have *not* found one. If there's one, I'd love to know though. – Levon Aug 14 '12 at 15:38
  • @Vladimir I am not sure, so therefore I qualified my answer. – Levon Aug 14 '12 at 15:40
  • Do you mean it gives the *second* approach the edge? Also note that here you're timing tuple creation vs. tuple creation and string interpolation. I'm not sure how that actually translates to what the `logging` module actually does with the arguments, etc though. – mgilson Aug 14 '12 at 15:43
  • @mgilson yes, the 2nd -- thanks! Re the other point, I asked OP about the timing issue in a comment below OP's post - I would have expected both strings to have been formatted before they are sent to `warn()` - is that not the case? So in that case the call to `warn()` and subsequent would be constant. – Levon Aug 14 '12 at 15:45
  • @levon: There are memory profiling tools for Python, sadly, none I've found are as easy to use as timeit! – Inactivist Aug 14 '12 at 15:45
  • I'm not sure what I really meant by my second comment. If the string gets logged, the string interpolation must happen at some point. You gain from approach 2 if you don't actually log the message (then the interpolation can be skipped). I guess I'm wondering how the differences in timing for a *function call* with `*args` goes depending one whether or not any `*args` are actually given. – mgilson Aug 14 '12 at 15:50
  • @mgilson Yes, in fact I just suggested to OP above that perhaps timing the calls would be worthwhile too. – Levon Aug 14 '12 at 15:52
6

Avoiding inline string formatting does save some time if the current logging level filters the log message (as I expected) -- but not much:

In [1]: import logging

In [2]: logger = logging.getLogger('foo')

In [3]: logger.setLevel(logging.ERROR)

In [4]: %timeit -n 1000000 logger.warn('%s %s %s' % ('a', 'b', 'c'))
1000000 loops, best of 3: 1.09 us per loop

In [12]: %timeit -n 1000000 logger.warn('%s %s %s', 'a', 'b', 'c')
1000000 loops, best of 3: 946 ns per loop

So, as user1202136 pointed out, the overall performance difference depends on how long it takes to format the string (which could be significant depending on the cost of calling __str__ on arguments being passed to the logging function.)

Community
  • 1
  • 1
Inactivist
  • 9,997
  • 6
  • 29
  • 41