603

What is the proper indentation for Python multiline strings within a function?

    def method():
        string = """line one
line two
line three"""

or

    def method():
        string = """line one
        line two
        line three"""

or something else?

It looks kind of weird to have the string hanging outside the function in the first example.

martineau
ensnare
  • 10
    Docstrings are treated [specially](http://stackoverflow.com/questions/16534246/docstring-processing-tools-following-pep-257): any indent of the first line is removed; the smallest common indent taken over all other non-blank lines is removed from them all. Other than that, multiline string literals in Python are unfortunately what-you-see-is-what-you-get in terms of whitespace: all characters between the string delimiters become part of the string, including indentation that, with Python reading instincts, looks like it should be measured from the indent of the line where the literal starts. – Evgeni Sergeev Sep 02 '15 at 07:30
  • 3
    @EvgeniSergeev The processing tool performs this task (and that largely depends on your choice of processing tool). `method.__doc__` isn't modified by Python itself any more than any other `str` literal. – c z Jun 06 '19 at 14:49
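A minimal sketch of the distinction drawn in the comments above: an ordinary string literal keeps its source indentation, while `inspect.getdoc` returns the docstring with the PEP 257-style clean-up applied. The function name `method` mirrors the question; the raw `__doc__` value is deliberately not shown, since its exact content can differ between interpreter versions.

```python
import inspect

def method():
    """line one
    line two
    line three"""
    text = """line one
    line two
    line three"""
    return text

# An ordinary literal keeps the source indentation of its continuation lines.
plain = method()
# inspect.getdoc() returns the docstring with common indentation removed.
cleaned = inspect.getdoc(method)

print(repr(plain))
print(repr(cleaned))
```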

12 Answers

526

You probably want to line up with the """

def foo():
    string = """line one
             line two
             line three"""

Since the newlines and spaces are included in the string itself, you will have to postprocess it. If you don't want to do that and you have a whole lot of text, you might want to store it separately in a text file. If a text file does not work well for your application and you don't want to postprocess, I'd probably go with

def foo():
    string = ("this is an "
              "implicitly joined "
              "string")

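To make the difference concrete, here is a small sketch comparing what the two literals above actually contain. The function names `aligned` and `joined` are made up for illustration.

```python
def aligned():
    string = """line one
             line two
             line three"""
    return string

def joined():
    string = ("this is an "
              "implicitly joined "
              "string")
    return string

# The triple-quoted literal contains the newlines and the alignment spaces.
print(repr(aligned()))
# The implicitly joined literal contains neither newlines nor indentation.
print(repr(joined()))
```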
If you want to postprocess a multiline string to trim out the parts you don't need, you should consider the textwrap module or the technique for postprocessing docstrings presented in PEP 257:

def trim(docstring):
    import sys
    if not docstring:
        return ''
    # Convert tabs to spaces (following the normal Python rules)
    # and split into a list of lines:
    lines = docstring.expandtabs().splitlines()
    # Determine minimum indentation (first line doesn't count):
    indent = sys.maxsize
    for line in lines[1:]:
        stripped = line.lstrip()
        if stripped:
            indent = min(indent, len(line) - len(stripped))
    # Remove indentation (first line is special):
    trimmed = [lines[0].strip()]
    if indent < sys.maxsize:
        for line in lines[1:]:
            trimmed.append(line[indent:].rstrip())
    # Strip off trailing and leading blank lines:
    while trimmed and not trimmed[-1]:
        trimmed.pop()
    while trimmed and not trimmed[0]:
        trimmed.pop(0)
    # Return a single string:
    return '\n'.join(trimmed)
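As a comment on this answer points out, the standard library offers `inspect.cleandoc` for the same kind of clean-up. The sketch below, with a made-up sample string `raw`, shows that the common indentation and the trailing blank line are removed while relative indentation is kept.

```python
import inspect

raw = """Summary line.

        Details start here.
            An indented example line.
        Back to the margin.
    """

# Common indentation of the lines after the first is stripped;
# the extra indentation of the example line is preserved.
cleaned = inspect.cleandoc(raw)
print(cleaned)
```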
MoRe
Mike Graham
  • 1
    It's good to have the same level of indentation for each line of text in the string. But that means that the lines of text should be a single indentation level in (4 columns), not starting at some arbitrary many-columns-along position from the previous line. – bignose Mar 24 '10 at 00:06
  • @bignose, I do not see how that requirement helps to keep code cleaner or more readable or has any particular advantage. – Mike Graham Mar 24 '10 at 00:08
  • 16
    This is the ‘hanging indent’ style of line continuation. It is prescribed in PEP8 for purposes like function definitions and long if statements, though not mentioned for multiline strings. Personally this is one place I refuse to follow PEP8 (and use 4-space indenting instead), as I strongly dislike hanging indents, which for me obscure the proper structure of the program. – bobince Mar 24 '10 at 12:19
  • @Mike Where is this usage of string described ? string = ("this is an " "implicitly joined " "string") – user Jun 26 '11 at 11:22
  • 2
    @buffer, in 3.1.2 of the official tutorial ("Two string literals next to each other are automatically concatenated...") and in the language reference. – Mike Graham Jun 28 '11 at 18:55
  • The second form with automatic string concatenation doesn't include newlines, so you need to add \n for every line. It is also very unnatural for inserting blank lines (you need to have "\n\n" lines). I wish there were support similar to bash's here-document with leading tabs suppressed (<<-END). – haridsv Oct 27 '11 at 05:44
  • 5
    *The second form with automatic string concatenation doesn't include newline* It's a feature. – Mike Graham Oct 27 '11 at 19:05
  • See also "Implicit string literal concatenation considered harmful?" by Guido van Rossum (some random guy...) http://lwn.net/Articles/551426/ – MarcH Jun 19 '13 at 06:53
  • It's worth recognizing that, while implicit concatenation is a misfeature that Python would be better for not having, this particular use isn't particularly hard to see or understand and isn't especially error-prone. – Mike Graham Jul 30 '14 at 13:06
  • 1
    +1 for the `("this is an "` ... technique. it works as a docstring. if you need newline, you can use `("this is an " "\n"` – n611x007 Aug 13 '14 at 11:23
  • 31
    The `trim()` function as specified in PEP257 is implemented in the standard library as [`inspect.cleandoc`](https://docs.python.org/2/library/inspect.html#inspect.cleandoc). –  Nov 28 '14 at 10:42
  • I found that for the 2nd example to work in Python 3, I also had to add a trailing \ to the end of the lines. Otherwise I get "unexpected indent" errors. – MrTJ Nov 15 '17 at 11:27
  • @MrTJ It works for me in Python 3 -- did you have the parens? – Mike Graham Nov 15 '17 at 12:36
  • 4
    +1 to @bobince 's comment about rejecting "hanging indents" here... Especially because if you change the variable name from `string` to `text` or anything of a different length, then you now need to update the indentation of *literally every single line of the multiline string* just to get it to match up with the `"""` properly. Indentation strategy should not complicate future refactors/maintenance, and it's one of the places that PEP really fails – kevlarr Jan 10 '18 at 20:21
  • 1
    While this is a solid answer, do yourselves a favor and take a look at @wihlke's [answer](https://stackoverflow.com/a/48112903/760306) for the cleanest solution. – Johannes Pille Jan 09 '20 at 05:27
370

The textwrap.dedent function allows one to start with correct indentation in the source, and then strip it from the text before use.

The trade-off, as noted by some others, is that this is an extra function call on the literal; take this into account when deciding where to place these literals in your code.

import textwrap

def frobnicate(param):
    """ Frobnicate the scrognate param.

        The Weebly-Ruckford algorithm is employed to frobnicate
        the scrognate to within an inch of its life.
        """
    prepare_the_comfy_chair(param)
    log_message = textwrap.dedent("""\
            Prepare to frobnicate:
            Here it comes...
                Any moment now.
            And: Frobnicate!""")
    weebly(param, log_message)
    ruckford(param)

The trailing \ in the log message literal is to ensure that line break isn't in the literal; that way, the literal doesn't start with a blank line, and instead starts with the next full line.

The return value from textwrap.dedent is the input string with all common leading whitespace indentation removed on each line of the string. So the above log_message value will be:

Prepare to frobnicate:
Here it comes...
    Any moment now.
And: Frobnicate!
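One way to address the per-call cost mentioned above is to dedent once, when the module is imported, and reuse the result. This is only a sketch; the names `LOG_MESSAGE` and `get_log_message` are hypothetical.

```python
import textwrap

# Dedented once, at import time.
LOG_MESSAGE = textwrap.dedent("""\
    Prepare to frobnicate:
    Here it comes...
        Any moment now.
    And: Frobnicate!""")

def get_log_message():
    # Only a name lookup happens per call; the dedent work is not repeated.
    return LOG_MESSAGE

print(get_log_message())
```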
bignose
  • 5
    While this is a reasonable solution and nice to know, doing something like this inside a frequently called function could prove to be a disaster. – haridsv Oct 27 '11 at 05:45
  • 2
    @haridsv Why would that be a disaster? – jtmoulia Jun 05 '12 at 18:14
  • 14
    @jtmoulia: A better description than disaster would be "inefficient" because the result of the `textwrap.dedent()` call is a constant value, just like its input argument. – martineau Aug 04 '12 at 01:34
  • 3
    @haridsv the origin of that disaster/inefficiency is *defining* a constant string **inside** a frequently called function. It is possible to trade the per-call constant definition for a per-call lookup. That way the *dedent* preprocessing would run *only once*. A relevant question may be http://stackoverflow.com/q/15495376/611007 It lists ideas to avoid defining the constant on each call, although the alternatives seem to require a lookup. Still, various ways to find a favorable place to store it are attempted. For example: `def foo: return foo.x` then next line `foo.x = textwrap.dedent("bar")`. – n611x007 Aug 13 '14 at 12:11
  • 1
    I guess it would be inefficient if the string is intended for logging that is only enabled in debug mode, and goes unused otherwise. But then why log a multiline string literal anyway? So it's hard to find a real-life example where the above would be inefficient (i.e. where it slows down the program considerably), because whatever is consuming these strings is going to be slower. – Evgeni Sergeev Sep 02 '15 at 04:21
160

Use inspect.cleandoc like so:

import inspect

def method():
    string = inspect.cleandoc("""
        line one
        line two
        line three""")

Relative indentation will be maintained as expected. As commented below, if you want to keep preceding empty lines, use textwrap.dedent. However that also keeps the first line break.

Note: It's good practice to indent a logical block of code under its related context to clarify the structure, e.g. the multi-line string belonging to the variable string.
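The difference between the two functions on a literal that starts with a line break can be sketched as follows. The extra indentation on the second line is added here only to show that relative indentation survives.

```python
import inspect
import textwrap

def method():
    raw = """
        line one
            line two
        line three"""
    return raw

raw = method()
via_cleandoc = inspect.cleandoc(raw)  # drops the leading blank line
via_dedent = textwrap.dedent(raw)     # keeps the leading line break

print(repr(via_cleandoc))
print(repr(via_dedent))
```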

wihlke
  • 10
    So confused why this answer didn't exist until now, `inspect.cleandoc` has existed ever since [Python 2.6](https://docs.python.org/2/library/inspect.html#inspect.cleandoc), which was [2008](https://www.python.org/download/releases/2.6/)..? Absolutely the cleanest answer, especially because it doesn't use the hanging indent style, which just wastes an unnecessary amount of space – kevlarr Jan 10 '18 at 20:35
  • 1
    This solution removes the first few lines of blank text (if any). If you don't want that behavior, use textwrap.dedent https://docs.python.org/2/library/textwrap.html#textwrap.dedent – joshuakcockrell Sep 04 '19 at 00:04
  • Example in docs suggests 'end first line with \ to avoid the empty line!', which resolves the 'keeps the first line break' aspect referred to above. https://docs.python.org/3/library/textwrap.html – autopoietic Mar 03 '23 at 07:37
38

One option which seems to be missing from the other answers (only mentioned deep down in a comment by naxa) is the following:

def foo():
    string = ("line one\n"          # Add \n in the string
              "line two"  "\n"      # Add "\n" after the string
              "line three\n")

This allows proper alignment, joins the lines implicitly, and still keeps the line breaks, which, for me, is one of the reasons why I would like to use multiline strings anyway.

It doesn't require any postprocessing, but you need to manually add the \n wherever you want a line to end, either inline or as a separate string after it. The latter is easier to copy-paste in.

holroy
  • 6
    Note that this is an example of an implicitly joined string, not a multiline string. – trk Jul 08 '18 at 08:49
  • 2
    @trk, it's multiline in the sense that the string contains newlines (aka multiple lines), but yes it uses joining to circumvent the formatting issues the OP had. – holroy Jul 08 '18 at 10:02
  • 1
    This looks like the best answer for me. But so far I don't understand why does python needs triple quotes operator if they result in a hard-to-read code. – klm123 May 20 '21 at 08:16
21

Some more options. In IPython with pylab enabled, dedent is already in the namespace. I checked, and it comes from matplotlib. It can also be imported with:

from matplotlib.cbook import dedent

The documentation states that it is faster than the textwrap equivalent, and in my quick tests in IPython it was indeed about 3 times faster on average. It also has the benefit that it discards any leading blank lines, which allows you to be flexible in how you construct the string:

"""
line 1 of string
line 2 of string
"""

"""\
line 1 of string
line 2 of string
"""

"""line 1 of string
line 2 of string
"""

Using the matplotlib dedent on these three examples will give the same sensible result. The textwrap dedent function will have a leading blank line with 1st example.

The obvious disadvantage is that textwrap is in the standard library, while matplotlib is an external module (and newer Matplotlib releases have deprecated and removed cbook.dedent).
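For a standard-library-only way to get one result from all three forms, `inspect.cleandoc` can be used. This is a sketch of that substitute, not a claim about how matplotlib's function behaved; note that `cleandoc` also drops the trailing newline.

```python
import inspect

form_1 = """
line 1 of string
line 2 of string
"""

form_2 = """\
line 1 of string
line 2 of string
"""

form_3 = """line 1 of string
line 2 of string
"""

# Leading and trailing blank lines are removed in every case.
results = [inspect.cleandoc(form) for form in (form_1, form_2, form_3)]
print(results)
```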

Some tradeoffs here... the dedent functions make your code more readable where the strings get defined, but require processing later to get the string in usable format. In docstrings it is obvious that you should use correct indentation as most uses of the docstring will do the required processing.

When I need a long string in my code, I use the following admittedly ugly code, where I let the long string drop out of the enclosing indentation. It definitely fails on "Beautiful is better than ugly.", but one could argue that it is simpler and more explicit than the dedent alternative.

def example():
    long_string = '''\
Lorem ipsum dolor sit amet, consectetur adipisicing
elit, sed do eiusmod tempor incididunt ut labore et
dolore magna aliqua. Ut enim ad minim veniam, quis
nostrud exercitation ullamco laboris nisi ut aliquip.\
'''
    return long_string

print(example())
Joop
11

If you want a quick and easy solution that saves you from typing newlines, you could opt for a list instead, e.g.:

def func(*args, **kwargs):
    string = '\n'.join([
        'first line of very long string and',
        'second line of the same long thing and',
        'third line of ...',
        'and so on...',
        ])
    print(string)
    return
steabert
  • 2
    While this isn't the best approach, I've used it from time to time. If you *do* use it, you should use a tuple instead of a list, since it's not going to be modified before being joined. – Lyndsy Simon Mar 22 '18 at 18:00
4

I prefer

    def method():
        string = \
"""\
line one
line two
line three\
"""

or

    def method():
        string = """\
line one
line two
line three\
"""
lk_vc
3

My two cents, escape the end of line to get the indents:

def foo(time, freq, temp):
    return "{}\n"\
           "freq: {}\n"\
           "temp: {}\n".format(time, freq, temp)
Simon
1

I came here looking for a simple one-liner to remove or correct the indentation level of the docstring for printing, without making it look untidy, for example by making it "hang outside the function" within the script.

Here's what I ended up doing:

def myfunction():

    """
    line 1 of docstring
    line 2 of docstring
    line 3 of docstring"""

print(myfunction.__doc__.replace('\n\t', '\n')[1:])

Obviously, if you're indenting with spaces (e.g. 4) rather than the tab key use something like this instead:

print(myfunction.__doc__.replace('\n    ', '\n')[1:])

And you don't need to remove the first character if you like your docstrings to look like this instead:

    """line 1 of docstring
    line 2 of docstring
    line 3 of docstring"""

print(myfunction.__doc__.replace('\n\t', '\n'))
0

For strings you can just post-process the string. For docstrings you need to post-process the function instead. Here is a solution for both that is still readable.

class Lstrip(object):
    def __rsub__(self, other):
        import re
        return re.sub(r'^\n', '', re.sub(r'\n$', '', re.sub(r'\n\s+', '\n', other)))

msg = '''
      Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
      tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim
      veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
      commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
      velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
      cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
      est laborum.
      ''' - Lstrip()

print(msg)

def lstrip_docstring(func):
    func.__doc__ = func.__doc__ - Lstrip()
    return func

@lstrip_docstring
def foo():
    '''
    Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod
    tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim
    veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea
    commodo consequat. Duis aute irure dolor in reprehenderit in voluptate
    velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat
    cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id
    est laborum.
    '''
    pass


print(foo.__doc__)
geckos
  • 3
    Processing docstrings must already process consistent indentation, as [described in PEP 257](http://www.python.org/dev/peps/pep-0257/#handling-docstring-indentation). There are already tools – e.g. [`inspect.cleandoc`](https://docs.python.org/3/library/inspect.html#inspect.cleandoc) – which do this the right way. – bignose Aug 15 '17 at 01:18
0

The first option is the good one - with indentation included. It is in python style - provides readability for the code.

To display it properly:

print(string.lstrip())
Bog Dia
  • This seems like the simplest and cleanest way to format triple quote strings so you don't have the extra spaces due to indentation – Taylor Liss May 05 '18 at 00:59
  • 13
    This will only delete leading spaces in the first line of a multiline string. It does not help with formatting following lines. – M. Schlenker Feb 21 '19 at 08:58
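The limitation described in this comment can be checked with a short sketch. Here `str.lstrip` leaves the indented continuation lines untouched, whereas `inspect.cleandoc` (used as one possible alternative) removes their common indentation.

```python
import inspect

def method():
    string = """line one
        line two
        line three"""
    return string

text = method()

# lstrip() only looks at the very start of the string, which has no
# leading whitespace here, so nothing changes.
stripped = text.lstrip()
# cleandoc() removes the common indentation of the following lines.
cleaned = inspect.cleandoc(text)

print(repr(stripped))
print(repr(cleaned))
```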
-2

It depends on how you want the text to display. If you want it all to be left-aligned then either format it as in the first snippet or iterate through the lines left-trimming all the space.

Ignacio Vazquez-Abrams
  • 5
    The way docstring-processing tools work is to remove not *all* the space on the left, but *as much* as the first indented line. This strategy is a bit more sophisticated and allows for you to indent and have it respected in the postprocessed string. – Mike Graham Mar 23 '10 at 23:59