23

These are two very popular ways of formatting a string in Python. One is using a dict:

>>> 'I will be %(years)i on %(month)s %(day)i' % {'years': 21, 'month': 'January', 'day': 23}
'I will be 21 on January 23'

And the other one using a simple tuple:

>>> 'I will be %i on %s %i' % (21, 'January', 23)
'I will be 21 on January 23'

The first one is much more readable, but the second one is faster to write. In practice I use them interchangeably.

What are the pros and cons of each one regarding performance, readability, and code optimization (is one of them transformed into the other?), plus anything else you think is useful to share?

juliomalegria
  • 2
    Good question. I personally find the second one more readable. Code readability is brain-dependent. On a side note, somehow curly braces always ruin it for me. – milancurcic Dec 06 '11 at 05:58
  • 1
    When using `format()` method you do not need to choose between these two ways of passing parameters - you can use both at the same time (see my answer). – Tadeck Dec 06 '11 at 06:22

3 Answers

22

Why format() is more flexible than % string operations

I think you should really stick to the format() method of str, because it is the preferred way to format strings and will probably replace the % string formatting operation in the future.

Furthermore, it has some really good features, such as combining position-based formatting with keyword-based formatting:

>>> string = 'I will be {} years and {} months on {month} {day}'
>>> some_date = {'month': 'January', 'day': '1st'}
>>> diff = [3, 11] # years, months
>>> string.format(*diff, **some_date)
'I will be 3 years and 11 months on January 1st'

even the following will work:

>>> string = 'On {month} {day} it will be {1} months, {0} years'
>>> string.format(*diff, **some_date)
'On January 1st it will be 11 months, 3 years'

There is also one other reason in favor of format(). Because it is a method, it can be passed as a callback, as in the following example:

>>> data = [(1, 2), ('a', 'b'), (5, 'ABC')]
>>> formatter = 'First is "{0[0]}", then comes "{0[1]}"'.format
>>> for item in map(formatter, data):
...     print(item)
...
First is "1", then comes "2"
First is "a", then comes "b"
First is "5", then comes "ABC"

Isn't it a lot more flexible than the % string formatting operation?

See the documentation page for more examples comparing % operations with the .format() method.

Comparing tuple-based % string formatting with dictionary-based

Generally, there are three ways of invoking % string operations (yes, three, not two), all of this form:

base_string % values

They differ in the type of values (which in turn depends on the content of base_string):

  • it can be a tuple, in which case the values are substituted one by one, in the order they appear in the tuple,

    >>> 'Three first values are: %f, %f and %f' % (3.14, 2.71, 1)
    'Three first values are: 3.140000, 2.710000 and 1.000000'
    
  • it can be a dict (dictionary), in which case the values are substituted by key,

    >>> 'My name is %(name)s, I am %(age)s years old' % {'name':'John','age':98}
    'My name is John, I am 98 years old'
    
  • it can be a single value, if base_string contains exactly one place where a value should be inserted:

    >>> 'This is a string: %s' % 'abc'
    'This is a string: abc'
    

There are obvious differences between them, and these ways cannot be combined (in contrast to the format() method, which can combine some of these features, as mentioned above).
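One practical consequence of the tuple and single-value forms sharing the same operator is worth a small sketch. The variable names below are illustrative only; the point is that a tuple on the right-hand side is always unpacked, so a tuple meant as one value has to be wrapped in a one-element tuple:

```python
# Sketch: how % treats a tuple on the right-hand side.
pair = (1, 2)

# A lone non-tuple value fills the single specifier.
single = 'Value: %s' % 'abc'

# A tuple is unpacked, so two values meet one specifier and % raises TypeError.
try:
    'Value: %s' % pair
    failed = False
except TypeError:
    failed = True

# Wrapping the tuple in a one-element tuple passes it as a single value.
wrapped = 'Value: %s' % (pair,)

# format() takes its arguments explicitly, so no wrapping is needed.
formatted = 'Value: {}'.format(pair)

print(single)
print(failed)
print(wrapped)
print(formatted)
```

With format() the ambiguity does not arise, because each argument is passed to the method separately.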

But one thing is specific to dictionary-based string formatting and is not readily available with the other two forms of the % operation: the ability to fill the specifiers from actual variable names in a simple manner:

>>> name = 'John'
>>> surname = 'Smith'
>>> age = 87
# some code goes here
>>> 'My name is %(surname)s, %(name)s %(surname)s. I am %(age)i.' % locals()
'My name is Smith, John Smith. I am 87.'

Just for the record: of course the above could easily be replaced with format() by unpacking the dictionary like this:

>>> 'My name is {surname}, {name} {surname}. I am {age}.'.format(**locals())
'My name is Smith, John Smith. I am 87.'
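As for the performance part of the question, nothing above measures it, and the outcome may well depend on the interpreter version. A minimal sketch of how one might check it locally with the standard library's timeit module follows; the function names are made up for this example, and no particular ranking is claimed:

```python
import timeit

TUPLE_TEMPLATE = 'I will be %i on %s %i'
DICT_TEMPLATE = 'I will be %(years)i on %(month)s %(day)i'
FORMAT_TEMPLATE = 'I will be {years} on {month} {day}'

values_tuple = (21, 'January', 23)
values_dict = {'years': 21, 'month': 'January', 'day': 23}


def with_tuple():
    # Tuple-based % formatting.
    return TUPLE_TEMPLATE % values_tuple


def with_dict():
    # Dictionary-based % formatting.
    return DICT_TEMPLATE % values_dict


def with_format():
    # Keyword-based str.format().
    return FORMAT_TEMPLATE.format(**values_dict)


def time_all(number=100000):
    """Return {function name: total seconds for `number` calls}."""
    candidates = (with_tuple, with_dict, with_format)
    return {func.__name__: timeit.timeit(func, number=number)
            for func in candidates}


for name, seconds in sorted(time_all().items()):
    print('%-12s %.4f s' % (name, seconds))
```

All three functions build the same string, so any difference in the printed timings comes from the formatting mechanism itself.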

Does anyone else know of a feature that is specific to one type of string formatting operation but not to the others? It would be interesting to hear about it.

Tadeck
  • 3
    `.format()` is awesome. I didn't know about the `*diff` expansion, so thanks for the tip ;) – Blender Dec 06 '11 at 06:26
  • both answers are great, I didn't know much about `format()`. However, any information about `%` formatting? (that was the main point) – juliomalegria Dec 06 '11 at 15:22
  • 1
    @julio.alegria: I have updated my answer with the example of the way you can use dictionary-based string operations - the way that can be very useful and is not easily replaced by tuple-based operations. – Tadeck Dec 06 '11 at 16:39
  • 1
    `locals()`! genius! didn't know about that either. One feature that `%` has and `format()` doesn't that comes to my mind is "conversions" (I'm not sure if that's the correct name), for instance: `%2.6f`. That's really useful sometimes. – juliomalegria Dec 06 '11 at 17:30
  • @julio.alegria: Thanks! When it comes to "conversions", I believe you are talking about this (from M. Lutz's "_Learning Python. 4th Edition_"): "_Floating-point numbers support the same type codes and formatting specificity in formatting method calls as in % expressions._". In other words, both `'%2.6f' % .123456789123` and `'{:2.6f}'.format(.123456789123)` give you `'0.123457'`. So, basically, `format()` has this specific feature of `%` operations. – Tadeck Dec 06 '11 at 18:11
  • @julio.alegria: I have just found in the same book something that works with `format()`, but I do not see its "`%`" counterpart: `'{0:,d}'.format(999999999999)` will produce `999,999,999,999`. There is also support for displaying in binary format (`'{:b}'.format(16)` will give `'10000'`). But I hope we will find more differences :) – Tadeck Dec 06 '11 at 18:33
  • I found tons of interesting features in the link you passed me in your answer! man, that `format()` is a fantastic method! now I understand why nobody said anything about `%` – juliomalegria Dec 06 '11 at 19:04
  • That `locals()` solution is beautiful. I just realized you included it! – Blender Apr 07 '12 at 23:52
  • @Blender: Thanks. In some cases it may be confusing, I believe. In cases where it is not necessary (e.g. you have multiple local variables but use only 1-2 of them), it should be avoided. – Tadeck Apr 09 '12 at 08:05
20

I'm not exactly answering your question, but just thought it'd be nice to throw format into your mix.

I personally prefer the syntax of format to both:

'I will be {years} on {month} {day}'.format(years=19, month='January', day=23)

If I want something compact, I just write:

'I will be {} on {} {}'.format(19, 'January', 23)

And format plays nicely with objects:

class Birthday:
  def __init__(self, age, month, day):
    self.age = age
    self.month = month
    self.day = day

print('I will be {b.age} on {b.month} {b.day}'.format(b=Birthday(19, 'January', 23)))
Blender
  • 1
    +1 Good point about objects. I also prefer `format()`. I have somewhat expanded on your answer by showing an example of how to combine the two ways of formatting (as mentioned by OP) into a single call using `format()`. – Tadeck Dec 06 '11 at 06:25
  • both answers are great, I didn't know much about `format()`. However, any information about `%` formatting? (that was the main point) – juliomalegria Dec 06 '11 at 15:21
-2

I am not answering the question, just explaining an idea I came up with for my TIScript.

I've introduced so-called "stringizer" functions: any function whose name starts with '$' is a stringizer. The compiler treats '$name(' and ')' as the quotes of a string literal combined with a function call.

For example, this:

$print(I will be {b.age} on {b.month} {b.day});

is actually compiled into

$print("I will be ", b.age, " on ",b.month," ",b.day);

where the even-numbered arguments (counting from zero) are always literal strings and the odd-numbered ones are expressions. This way it is possible to define custom stringizers that use different formatting or argument processing.

For example, Element.$html(Hello <b>{who}</b>); will apply HTML escaping to the expressions. And Element.$(option[value={12}]); will do a jQuery-style select.

Pretty convenient and flexible.

I am not sure whether it is possible to do something like this in Python without changing its compiler. Consider it just as an idea.
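For what it is worth, the HTML-escaping case can be roughly approximated in plain Python without compiler changes by subclassing the standard library's string.Formatter. This is only a sketch under assumptions: HtmlFormatter and html_format are hypothetical names made up for this example, the code targets Python 3 (html.escape), and it is a library-level workaround rather than an equivalent of the TIScript syntax:

```python
import html
import string


class HtmlFormatter(string.Formatter):
    """Hypothetical helper: escapes substituted values, not the template."""

    def format_field(self, value, format_spec):
        # Let the normal machinery format the value first, then escape it.
        formatted = super().format_field(value, format_spec)
        return html.escape(formatted)


# The bound method can be passed around like any other callable.
html_format = HtmlFormatter().format

result = html_format('Hello <b>{who}</b>', who='<Tom & Jerry>')
print(result)
```

The literal parts of the template are left untouched, while every substituted value goes through html.escape, which mirrors the "literal strings versus expressions" split described above.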

c-smile
  • 1
    Actually I do not believe such a thing would be welcome in Python, not only because this would require syntax changes. There is also a rule saying that **Explicit is better than implicit**. However, what you do can be easily accomplished by custom methods / functions that preprocess parameters and pass them into the `format()` method of the string you are referring to. Or even something like this: `'Hello {who}'.format(**escape_html(some_data))`. Not a big deal in Python, though. – Tadeck Dec 06 '11 at 08:41
  • sorry, but I will have to downvote for not answering the question nor posting any Python related answer – juliomalegria Dec 06 '11 at 15:00