How to convert octal escape sequences with Python

Question

I extracted JavaScript code from a PDF, but it comes out as octal escape sequences.

I want to convert it to normal JavaScript code.

\040\040\040\040\146\165\156\143\164\151\157\156\040\163\167\050\051\17....

Please advise me.

See the linked duplicate if you actually have a string with backslashes in it (for example, by reading data from the PDF). If you just have something like that in your source code, then nothing needs to be done. — Karl Knechtel, Aug 04 '22 at 23:17

score 3 · Answer 1 · answered Apr 19 '14 at 17:50

In Python 2.x:

>>> r'\040\040\040\040\146\165\156\143\164\151\157\156'.decode('unicode-escape')
u'    function'

In Python 3.x:

>>> br'\040\040\040\040\146\165\156\143\164\151\157\156'.decode('unicode-escape')
'    function'
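If the text arrives at runtime (for example, read from the PDF) rather than as a literal typed into source code, the octal escapes can also be replaced explicitly. The following is a minimal sketch for Python 3; the helper name `decode_octal_escapes` is hypothetical, and it assumes the input contains only `\ooo` octal escapes that each stand for a single character.

```python
import re

# Matches a backslash followed by one to three octal digits, e.g. \040 or \7.
_OCTAL_ESCAPE = re.compile(r'\\([0-7]{1,3})')


def decode_octal_escapes(text):
    """Replace each backslash-octal escape in text with the character it encodes.

    Hypothetical helper: other escape types (\\n, \\x41, ...) are left untouched.
    """
    return _OCTAL_ESCAPE.sub(lambda m: chr(int(m.group(1), 8)), text)


# Raw string: the backslashes are real characters, as they would be in file data.
raw = r'\040\040\040\040\146\165\156\143\164\151\157\156\040\163\167\050\051'
print(repr(decode_octal_escapes(raw)))  # '    function sw()'
```

Unlike `unicode-escape`, this approach touches only octal escapes, which may be preferable if the extracted JavaScript itself contains sequences such as `\n` that should stay as written.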

user3286261 · Accepted Answer · 2014-04-19T18:08:59.033

1

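This works in both Python 2.x and 3.x when the escapes are written in a non-raw literal in your source code, because Python resolves them while parsing the literal: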

>>> b'\040\040\040\040\146\165\156\143\164\151\157\156\040\163\167'.decode('utf-8')
'    function sw'

edited Apr 19 '14 at 18:08

answered Apr 19 '14 at 17:58

user3286261

If you use escape sequences in a non-raw string literal, you don't need to call `decode`. – falsetru Apr 19 '14 at 18:06

Yeah but you end up with a byte array instead of a string. – user3286261 Apr 19 '14 at 18:12
You're right. Without the `decode` call, you will get a `bytes` object (but only in Python 3.x). – falsetru Apr 19 '14 at 18:18
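The distinction discussed in these comments can be checked directly. This is a small illustration for Python 3, using a shortened sequence (`\146\165\156`, the first three characters of `function`); the variable names are only for demonstration.

```python
# Non-raw literal: Python resolves \146 etc. while parsing the source code.
escaped = b'\146\165\156'

# Raw literal: the backslashes and digits are kept as individual bytes.
raw = br'\146\165\156'

print(escaped)                       # b'fun' (a bytes object, not a str)
print(len(raw))                      # 12: three escapes of four bytes each
print(escaped.decode('utf-8'))       # fun
print(raw.decode('utf-8'))           # \146\165\156 (escapes left unresolved)
print(raw.decode('unicode-escape'))  # fun
```

So `decode('utf-8')` only turns already-resolved bytes into a `str`; it is `unicode-escape` that interprets backslash sequences present in the data itself.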
