Python repr string w/ real newlines

Question

I want to use repr() to get a Python-encoded string literal (that I can paste into some source code), but I'd prefer a triple-quoted string with real newlines rather than the \n escape sequence.

I could post-process the string to convert \n back into a newline char and add a couple more quotes, but if the source contains a literal \\n (an escaped backslash followed by n), I wouldn't want to match on that.

What's the easiest way to do this?

Example input:

foo💩
bar

Or as a Python string:

'foo💩\nbar'

Desired output:

'''foo\xf0\x9f\x92\xa9
bar'''

Triple-single or triple-double quotes are fine, but I do want it broken on multiple lines like that.

What I have so far:

#!/usr/bin/env python
import sys
import re

with open(sys.argv[1], 'r+') as f:
    data = f.read()
    f.seek(0)
    out = "''" + re.sub(r"\\n", '\n', repr(data)) + "''"
    f.write(out)
    f.truncate()

I'm still trying to figure out the regex to avoid converting escaped \ns.

The goal is that if I paste that back into a Python source file I will get back out exactly the same thing as I read in.

I'm using Python 2.7.14
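
For the regex part, one possible approach (a sketch, not something from the original post) is to match every backslash escape as a unit and rewrite only the ones that are `\n`. Because the scan runs left to right, an escaped backslash `\\` is consumed as a single match, so an `n` that follows it is never treated as the start of a newline escape. The helper name `newlines_to_real` is hypothetical, and the sketch covers only the regex; it does not address which quote character `repr` picks.

```python
import re


def newlines_to_real(literal_body):
    """Turn \\n escapes in repr() output into real newlines.

    Each match is a backslash plus the character after it, so an escaped
    backslash is swallowed whole and left untouched.
    """
    return re.sub(
        r'\\(.)',
        lambda m: '\n' if m.group(1) == 'n' else m.group(0),
        literal_body,
    )


# data holds a real newline and also a literal backslash followed by "n".
data = 'foo\nbar\\nbaz'
out = "''" + newlines_to_real(repr(data)) + "''"
assert eval(out) == data
```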

Isn't that just `print(your_string)`? I don't really get your desired input and output. — wim, Mar 28 '19 at 22:56
@wim No. `repr` will escape quotes, emojis and other control characters, which I do want. — mpen, Mar 28 '19 at 22:59
OK, please post an example input and output. Btw repr will not escape emojis in the current version of Python - maybe you should tag this with python-2.x ? — wim, Mar 28 '19 at 22:59
Are you really sure you want `'foo\nbar'` and not `u'foo\nbar'`? The proper escape here would be `foo\U0001f4a9\nbar` - what you are showing here is utf-8 encoded — wim, Mar 28 '19 at 23:04
Uhh.. yeah, I think you're right. I don't actually have any emoji poops in my source, but there might be some other wonky stuff. I basically just need Python to be able to parse it and come out the same way as the input. — mpen, Mar 28 '19 at 23:08

georg · Accepted Answer · 2019-03-28T23:14:19.997

2

How about splitting it into lines and encoding each line separately:

s = 'foo\nbar'

r = "'''" + '\n'.join(repr(x)[1:-1] for x in s.splitlines()) + "'''"

assert eval(r) == s

If you're on Python 2 and the inputs are unicode, use `repr(x)[2:-1]` to strip the leading `u` as well. The same applies to Python 3 with bytes inputs, where the prefix is `b`.
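
A few inputs can trip up the one-liner above: `splitlines()` also splits on `\r` (and, for unicode text, several other separators), a trailing newline is dropped, and a line that `repr` wraps in double quotes can leave an unescaped `'` right before the closing `'''`. The sketch below is one possible way to handle those cases, assuming Python 3 `str` input. The helper name `triple_quoted_literal` is made up for illustration and is not part of the original answer.

```python
import ast


def triple_quoted_literal(s):
    """Return a '''...''' literal for the str s that evaluates back to s."""
    lines = []
    # Split on "\n" only, so "\r" and other separators stay inside a line
    # and get escaped by repr(). A trailing newline yields a final empty part.
    for part in s.split('\n'):
        r = repr(part)
        quote, body = r[0], r[1:-1]
        if quote == '"':
            # repr() chose double quotes, so single quotes are unescaped;
            # escape them because the literal is delimited by '''.
            body = body.replace("'", "\\'")
        lines.append(body)
    return "'''" + '\n'.join(lines) + "'''"


for sample in ['foo\nbar', 'foo\nbar\n', 'a\r\nb', "ends with '", 'x\\n y']:
    assert ast.literal_eval(triple_quoted_literal(sample)) == sample
```

Splitting on `'\n'` rather than calling `splitlines()` is the main design choice here: it keeps the output broken at real newlines while leaving every other control character to `repr`.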

edited Mar 28 '19 at 23:14

answered Mar 28 '19 at 23:05

georg


Smart. Split the lines before calling `repr` to avoid the whole escaping issue. – mpen Mar 28 '19 at 23:09

score 0 · Answer 2 · answered Mar 28 '19 at 23:23

Final solution to convert a text file into a string which you can paste into your source code:

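#!/usr/bin/env python
import sys
import io

# io.open with an encoding returns unicode text on Python 2.
with io.open(sys.argv[1], 'r+', encoding='utf8') as f:
    data = f.read()
    f.seek(0)
    # repr() of a unicode line looks like u'...'; [2:-1] strips the u prefix and both quotes.
    out = u"u'''" + u'\n'.join(repr(x)[2:-1] for x in data.splitlines()) + u"'''"
    f.write(out)
    # Cut off any leftover content if the new text is shorter than the old.
    f.truncate()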

Warning: it overwrites the source file. I'm using temporary files for this, so that's what I wanted.

Credit:

georg
Mark
