
Is there any way in Python to add additional conversion types to string formatting?

The standard conversion types used in %-based string formatting are things like s for strings, d for signed decimal integers, etc. What I'd like to do is add a new conversion character for which I can specify a custom handler (for instance, a lambda function) that returns the string to insert.

For instance, I'd like to add h as a conversion type to specify that the string should be escaped for use in HTML. As an example:

#!/usr/bin/python

print "<title>%(TITLE)h</title>" % {"TITLE": "Proof that 12 < 6"}

This would apply cgi.escape to the "TITLE" value and produce the following output:

<title>Proof that 12 &lt; 6</title>
brianmearns
  • You can't add a new placeholder type to string formatting, but you can always pass your input data through a function that returns the desired output as a string: `'%(TITLE)s' % {'TITLE': my_html_formatter('Proof that 12 < 6')}` – Peter Varo Nov 08 '13 at 16:49
  • Thanks, I know. I've got a bunch of different strings I'm going to be passing in, and I was hoping to come up with a nicer way than passing them all to a function separately. I was also hoping to be able to use the same key (e.g., "TITLE") multiple times with different formatting. – brianmearns Nov 08 '13 at 16:56

3 Answers


You can create a custom formatter for HTML templates by subclassing string.Formatter:

import string, cgi

class Template(string.Formatter):
    def format_field(self, value, spec):
        if spec.endswith('h'):
            value = cgi.escape(value)
            spec = spec[:-1] + 's'
        return super(Template, self).format_field(value, spec)

print Template().format('{0:h} {1:d}', "<hello>", 123)

Note that all conversion takes place inside the template class; no change to the input data is required.
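
The snippet above is written for Python 2, and `cgi.escape` was later removed from the standard library. As a minimal sketch of the same `format_field` idea on Python 3, assuming `html.escape` is an acceptable replacement (the class name `HTMLTemplate` is made up for illustration):

```python
import html
import string


class HTMLTemplate(string.Formatter):
    """Treat a trailing 'h' in the format spec as: HTML-escape, then format as a string."""

    def format_field(self, value, format_spec):
        if format_spec.endswith("h"):
            # quote=False escapes only &, < and >, which is an assumption about
            # the behaviour wanted here; pass quote=True to escape quotes as well.
            value = html.escape(str(value), quote=False)
            # Hand the rest of the spec (width, alignment, ...) to normal 's' formatting.
            format_spec = format_spec[:-1] + "s"
        return super().format_field(value, format_spec)


template = HTMLTemplate()

# The same key can appear several times with different specs.
result = template.format("<title>{TITLE:h}</title> ({TITLE})", TITLE="Proof that 12 < 6")
print(result)
```

Because the escaping happens before the remaining spec is applied, any width or alignment in the spec pads the escaped text, not the raw input.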

georg
  • Interesting. I like the idea of not having to use custom types. I'm going to give both a try and see what works out better before picking an answer. – brianmearns Nov 08 '13 at 17:36
  • I ended up selecting this method because it doesn't require wrapping everything in custom types that override `__format__`. However, you can also combine it with custom types that override `__format__` without any problems. – brianmearns Nov 08 '13 at 18:53
  • Edited for more accurate spec processing. – georg Nov 08 '13 at 19:07
  • Note there is also a simpler `convert_field` method on the `Formatter` class, which handles format-field conversions, like the built-in `!s` and `!r`. – brianmearns Nov 19 '14 at 15:10

Not with % formatting, no; it is not extensible.

You can specify different formatting options when using the newer format string syntax defined for str.format() and format(). Custom types can implement a __format__() method, and that will be called with the format specification used in the template string:

import cgi

class HTMLEscapedString(unicode):
    def __format__(self, spec):
        value = unicode(self)
        if spec.endswith('h'):
            value = cgi.escape(value)
            spec = spec[:-1] + 's'
        return format(value, spec)

This does require that you use a custom type for your strings:

>>> title = HTMLEscapedString(u'Proof that 12 < 6')
>>> print "<title>{:h}</title>".format(title)
<title>Proof that 12 &lt; 6</title>

For most cases, it is easier just to format the string before handing it to the template, or use a dedicated HTML templating library such as Chameleon, Mako or Jinja2; these handle HTML escaping for you.

Martijn Pieters
  • Thanks. I just found the same solution, but your example is really useful. – brianmearns Nov 08 '13 at 16:57
  • This class doesn't make much sense to me, you could simply say `"{:s}".format(cgi.escape(title)...` - exactly what the OP tries to avoid. – georg Nov 08 '13 at 17:15
  • @thg435: That's what I state at the end anyway. It is mostly an example of how you can hook into string formatting with custom formatting specifications. – Martijn Pieters Nov 08 '13 at 17:18
  • @thg435: This does offer a little more flexibility. For instance, I can use the same format key multiple times with different format/conversion types. It also encapsulates the HTML escaping, so if I want to do something other than `cgi.escape`, for instance, I only have to change it in one place and not think about it anywhere else. – brianmearns Nov 08 '13 at 17:33

I'm a bit late to the party, but here's what I do, based on an idea in https://mail.python.org/pipermail/python-ideas/2011-March/009426.html

>>> import string
>>> from xml.sax.saxutils import quoteattr
>>> class MyFormatter(string.Formatter):
    def convert_field(self, value, conversion, _entities={'"': '&quot;'}):
        if 'Q' == conversion:
            return quoteattr(value, _entities)
        else:
            return super(MyFormatter, self).convert_field(value, conversion)

>>> fmt = MyFormatter().format
>>> fmt('{0!Q}', '<hello> "world"')
'"&lt;hello&gt; &quot;world&quot;"'
samwyse