I extracted JavaScript code from a PDF, but it came out as octal escape sequences.

I want to convert it back to normal JavaScript code.

\040\040\040\040\146\165\156\143\164\151\157\156\040\163\167\050\051\17....

Please advise me.

hucuhy
  • See the linked duplicate if you actually have a string with backslashes in it (for example, by reading data from the PDF). If you just have something like that in your source code, then nothing needs to be done. – Karl Knechtel Aug 04 '22 at 23:17

2 Answers

You can decode it with the unicode_escape codec (the hyphenated alias unicode-escape used below is equivalent):

In Python 2.x:

>>> r'\040\040\040\040\146\165\156\143\164\151\157\156'.decode('unicode-escape')
u'    function'

In Python 3.x:

>>> br'\040\040\040\040\146\165\156\143\164\151\157\156'.decode('unicode-escape')
'    function'
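
If the extracted data may also contain non-ASCII bytes, a regex-based variant is one alternative, since unicode_escape treats the input as Latin-1. The following is only a sketch for Python 3: the helper name decode_octal_escapes is hypothetical, and it assumes the only escapes present are a backslash followed by one to three octal digits. The final Latin-1 decode is also an assumption; use whatever encoding the PDF text actually has.

```python
import re

# A backslash followed by one to three octal digits, matched on bytes.
_OCTAL_ESCAPE = re.compile(rb'\\([0-7]{1,3})')

def decode_octal_escapes(raw: bytes) -> bytes:
    # Hypothetical helper: swap each octal escape for the single byte it encodes.
    # The mask keeps values such as \777 inside the 0-255 byte range.
    return _OCTAL_ESCAPE.sub(lambda m: bytes([int(m.group(1), 8) & 0xFF]), raw)

escaped = br'\040\040\040\040\146\165\156\143\164\151\157\156\040\163\167\050\051'
# Assumption: the decoded bytes are Latin-1 text.
print(decode_octal_escapes(escaped).decode('latin-1'))  # prints "    function sw()"
```

Bytes that are not part of an octal escape pass through unchanged, so other escapes (for example \n or \\) would still need separate handling.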
falsetru

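If the escapes sit in a non-raw bytes literal in your source code, Python already resolves them when it parses the literal, so a plain decode is enough in both Python 2.x and 3.x (Python 2 shows the result with a u prefix). This does not help with text read from a file, where the backslashes are literal characters: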

>>> b'\040\040\040\040\146\165\156\143\164\151\157\156\040\163\167'.decode('utf-8')
'    function sw'
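
The difference between the two answers comes down to whether the backslashes are real characters in the data. A small sketch (variable names are illustrative only) that makes this visible:

```python
raw_literal = br'\040\146'    # raw: backslashes are kept, 8 bytes in total
plain_literal = b'\040\146'   # non-raw: Python parses the escapes, 2 bytes

print(len(raw_literal), len(plain_literal))   # prints "8 2"

# Data with literal backslashes needs the escapes interpreted explicitly...
print(repr(raw_literal.decode('unicode-escape')))   # prints "' f'"
# ...while the already-parsed literal only needs an ordinary decode.
print(repr(plain_literal.decode('utf-8')))          # prints "' f'"
```

Bytes read from a file behave like the raw literal here.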
user3286261