
Let's say I have a function load_from_xml() that loads a string from an XML file. The string contains escape sequences, for example: "Here\x20are\x20spaces". When I print it, I get:

s = str(load_from_xml())
print(s)
>"Here\x20are\x20spaces"

which is not desired output. Desired output would be:

>"Here are spaces"

Any idea why print() ignores escape sequences?

Edit: a sample load_from_xml() function:

import xml.etree.ElementTree as ET

def load_from_xml():

    xml_string = "<data>Here\\x20are\\x20spaces</data>"  # doubled backslashes, so the data contains a literal \x20
    root = ET.fromstring(xml_string)
    return root.text
Fredyman
  • Could you add an example for your 'load_from_xml()' function? I ran the program using your example string and I had no problems. – Ismail Hafeez Mar 07 '21 at 16:16
  • I just found a similar question; the answer is [here](https://stackoverflow.com/questions/4020539/process-escape-sequences-in-a-string-in-python/4020824#4020824) – Fredyman Mar 07 '21 at 16:59
  • `print()` isn't "ignoring" escape sequences; `print` is *not responsible for them in the first place*. Unless you explicitly use code to simulate the effect, escape sequences **only** apply to string *literals, in the source code of your program*, and their effect is applied *before the code runs*. – Karl Knechtel Aug 06 '22 at 00:21
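The distinction made in the last comment can be checked directly. The following is a minimal sketch (the variable names `literal` and `raw` are only illustrative) comparing a string literal, where Python processes `\x20` before the code runs, with a string that really contains a backslash, an `x`, a `2` and a `0`, which is what the XML loader returns:

```python
# In a normal string literal, \x20 is turned into a space when the source is compiled.
literal = "Here\x20are\x20spaces"

# In a raw string literal the four characters \, x, 2, 0 are kept as they are,
# which matches the text that comes back from the XML file.
raw = r"Here\x20are\x20spaces"

print(len(literal))  # 15
print(len(raw))      # 21
print(literal)       # Here are spaces
print(raw)           # Here\x20are\x20spaces
```

`print()` shows both strings exactly as they are stored; only the first one ever contained spaces.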

2 Answers

-1

It's unquote you're looking for :)

import urllib.parse
urllib.parse.unquote("Here\x20are\x20spaces")

EDIT :

Then, simply :

import urllib.parse
urllib.parse.unquote(load_from_xml().replace(r'\\', r'\'))

EDIT :

But actually, the first solution I gave you will work: looking at your function, you already have single backslashes in your example...

And to answer to your final question : Because it's not a usual escape sequence, it's an url quoting sequence...

Icarwiz
  • This might work, but the string I am getting as input should be written more like "Here\\x20are\\x20spaces", so that the escape sequence is stored as text, not as the Unicode character. – Fredyman Mar 07 '21 at 16:50
  • This answer is **completely wrong**. "And to answer to your final question : Because it's not a usual escape sequence, it's an url quoting sequence..." No, the escape sequence in question is not specific to URLs at all. The point is that the string **actually contains** a backslash, lowercase x, two and zero. The code here will not work and makes no sense. `"Here\x20are\x20spaces"` *in the source code* **already** contains spaces, so `urllib.parse.unquote` does nothing; `r'\'` [is not valid](https://stackoverflow.com/questions/647769/) and trying this kind of replacement makes no sense anyway. – Karl Knechtel Aug 06 '22 at 00:25
-1
a = "Here\x20are\x20spaces"
"".join(c for c in a if c.isprintable())

Output:

'Here are spaces'

str.isprintable

Return True if all characters in the string are printable or the string is empty, False otherwise. Nonprintable characters are those characters defined in the Unicode character database as “Other” or “Separator”, excepting the ASCII space (0x20) which is considered printable.

Ajay
  • This might work, but I am looking for something that can convert all escape sequences, not just spaces. – Fredyman Mar 07 '21 at 16:46
  • This answer is wrong, and misses the point entirely. Writing `a="Here\x20are\x20spaces"` **in source code** creates a string that **already contains** perfectly ordinary spaces, so there is nothing to replace. `\x20` is a code **for a space**. The problem is that the input **does not** contain spaces; it contains *actual backslashes etc.* which need to be converted. Backslash is a printable character, so this would not filter anything out, and it shouldn't be filtered out anyway - it should be converted. – Karl Knechtel Aug 06 '22 at 00:27
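As the comments point out, the backslash sequences have to be converted explicitly. One possible sketch, along the lines of the linked question, is shown below; the helper name `decode_escapes` and the pattern `ESCAPE_RE` are hypothetical and not part of the original posts. It assumes the input uses Python-style escapes (`\xhh`, `\uhhhh`, `\Uhhhhhhhh`, octal, and the usual single-character ones) and decodes only the matched sequences, so non-ASCII text elsewhere in the string is left untouched:

```python
import codecs
import re

# Hypothetical pattern: matches Python-style backslash escapes only.
ESCAPE_RE = re.compile(
    r'''\\(U[0-9a-fA-F]{8}|u[0-9a-fA-F]{4}|x[0-9a-fA-F]{2}|[0-7]{1,3}|[\\'"abfnrtv])'''
)

def decode_escapes(text):
    """Replace each escape sequence found in text with the character it stands for."""
    def replace(match):
        # Decode just the matched escape, e.g. the four characters \x20 become a space.
        return codecs.decode(match.group(0), "unicode_escape")
    return ESCAPE_RE.sub(replace, text)

s = r"Here\x20are\x20spaces"  # same content as the string returned by load_from_xml()
print(decode_escapes(s))      # Here are spaces
```

Decoding the whole string with `unicode_escape` in one call is shorter, but it can mangle non-ASCII characters, which is why this sketch decodes one match at a time.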