Decoding a list in Python

Question

I have a list in Python:

['Deals', "\\\\xe2\\\\x98\\\\x85What\\\\\\'s New\\\\xe2\\\\x98\\\\x85", '\\\\xc3\\\\x80la carte & Value Meals', 'Crispy Chicken', 'Share Box', 'Happy Meals', 'Desserts', 'McCafe\\\\xc3\\\\xa9', 'Beverages', 'Side Lines', 'Snack Time']

I want to decode it so that, for example, "\\\\xe2\\\\x98\\\\x85What\\\\\\'s New\\\\xe2\\\\x98\\\\x85", '\\\\xc3\\\\x80la carte & Value Meals' and 'McCafe\\\\xc3\\\\xa9' become "What's New", 'la carte & Value Meals' and 'McCafe'. The output should contain no encoded sequences (such as \\x98). Desired output:

['Deals', "What's New", 'la carte & Value Meals', 'Crispy Chicken', 'Share Box', 'Happy Meals', 'Desserts', 'McCafe', 'Beverages', 'Side Lines', 'Snack Time']

Could you please be more specific? You can go through your list and call string.replace('\\x98','') on each item. This replaces '\\x98' with '', which effectively deletes it. — Valentyn, Jun 12 '20 at 21:05
Possible duplicate of [Regular expression that finds and replaces non-ascii characters with Python](https://stackoverflow.com/questions/2758921/regular-expression-that-finds-and-replaces-non-ascii-characters-with-python) (per "please see this answer" link in second answer) — pppery, Jun 12 '20 at 22:14
Really, your list is slightly garbled UTF-8 that decodes to `★What's New★`, `Àla carte` and `McCafeé`. Accented letters are letters, and _Black Star_ could matter as well… — JosefZ, Jun 12 '20 at 22:15
It seems like what you're asking for is to replace or delete non-ASCII literals (e.g., `\xe2`) but keep all other letters and spaces. [This answer](https://stackoverflow.com/questions/2758921/regular-expression-that-finds-and-replaces-non-ascii-characters-with-python) seems to be the general-purpose solution for finding and replacing non-ASCII characters in a string. — Professor Nuke, Jun 12 '20 at 21:07
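
To illustrate the two comments above, here is a minimal sketch. It assumes each item holds the literal text \xNN (a backslash, x and two hex digits) rather than real bytes, and it uses a shortened copy of the list written with raw strings. The names decode_escapes and HEX_ESCAPE_RUN are made up for this example. It first turns the escapes back into the characters they encode, then drops anything outside ASCII.

```python
import re

# Assumption: items contain the literal text "\xNN", not real bytes.
# Shortened copy of the list, written as raw strings.
items = [
    "Deals",
    r"\xe2\x98\x85What\'s New\xe2\x98\x85",
    r"\xc3\x80la carte & Value Meals",
    r"McCafe\xc3\xa9",
]

# One or more consecutive \xNN escapes (hypothetical helper name)
HEX_ESCAPE_RUN = re.compile(r"(?:\\x[0-9a-fA-F]{2})+")


def decode_escapes(text):
    """Turn runs of literal \\xNN escapes into the UTF-8 characters they encode."""

    def to_chars(match):
        hex_digits = match.group(0).replace("\\x", "")
        # errors="replace" keeps going if a run is not valid UTF-8
        return bytes.fromhex(hex_digits).decode("utf-8", errors="replace")

    # The list also contains an escaped apostrophe (\'), so unescape that too
    return HEX_ESCAPE_RUN.sub(to_chars, text).replace("\\'", "'")


decoded = [decode_escapes(s) for s in items]
print(decoded)

# Dropping every non-ASCII character gives the output the question asks for
ascii_only = [s.encode("ascii", errors="ignore").decode("ascii").strip() for s in decoded]
print(ascii_only)
```

Decoding first keeps the choice open: the accented letters and the star survive in decoded, and only the last step throws them away.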

Red · Accepted Answer · 2020-06-12T21:16:05.367

You can use the built-in re module:

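import re

# Items contain literal backslash-x sequences (plain text, not real bytes)
items = ['Deals', "\\\\xe2\\\\x98\\\\x85What\\\\\\'s New\\\\xe2\\\\x98\\\\x85", '\\\\xc3\\\\x80la carte & Value Meals', 'Crispy Chicken', 'Share Box', 'Happy Meals', 'Desserts', 'McCafe\\\\xc3\\\\xa9', 'Beverages', 'Side Lines', 'Snack Time']
# Blank out each "x??" triple, drop the leftover backslashes, trim the spaces
items = [re.sub(r'x\S\d', ' ', s).replace('\\', '').strip() for s in items]
print(items)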

Output:

['Deals', "What's New", 'la carte & Value Meals', 'Crispy Chicken', 'Share Box', 'Happy Meals', 'Desserts', 'McCafe', 'Beverages', 'Side Lines', 'Snack Time']

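re.sub(r'x\S\d',' ',s) returns the string s with every run of three neighboring characters that fits this pattern replaced with a space:

x, then any non-whitespace character (\S), then a digit (\d)

.replace('\\','') then removes the leftover backslashes, and .strip() trims the spaces left at both ends.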
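
To make the steps visible, here is a small walk-through on one item. It is only a sketch: the item is written as a raw string with single backslashes, which is an assumption about how the data looks, and the variable names are made up for the example.

```python
import re

# One item of the list; assumed to contain literal backslash-x text
s = r"\xe2\x98\x85What\'s New\xe2\x98\x85"

# Every "x", non-whitespace character, digit triple that the pattern finds
matches = re.findall(r"x\S\d", s)
print(matches)

# Step 1: each triple becomes a space; the backslashes are still there
step1 = re.sub(r"x\S\d", " ", s)
print(repr(step1))

# Step 2: remove all backslashes, including the one in front of the apostrophe
step2 = step1.replace("\\", "")
print(repr(step2))

# Step 3: trim the spaces left at both ends
step3 = step2.strip()
print(repr(step3))
```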

I'm downvoting because it's not a universal solution. For instance, _Latin Small Letter E With Diaeresis_ with UTF-8 byte sequence `0xC3`, `0xAB` does not match your pattern, see also [this comment](https://stackoverflow.com/questions/62352305/decoding-a-list-in-python#comment110277581_62352305). — JosefZ, Jun 12 '20 at 22:40
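
One possible way to address this is a stricter pattern that only matches a backslash, x and exactly two hex digits. This is a sketch rather than part of the accepted answer: the function names and the sample items are hypothetical, and it again assumes the escapes are literal text with a single backslash.

```python
import re

# Pattern from the answer, kept for comparison
LOOSE = re.compile(r"x\S\d")
# Stricter alternative: a backslash, 'x', then exactly two hex digits
HEX_ESCAPE = re.compile(r"\\x[0-9a-fA-F]{2}")


def strip_loose(text):
    """The answer's approach: blank out x?? triples, drop backslashes, trim."""
    return LOOSE.sub(" ", text).replace("\\", "").strip()


def strip_hex_escapes(text):
    """Remove literal \\xNN escapes, unescape \\' and trim."""
    return HEX_ESCAPE.sub("", text).replace("\\'", "'").strip()


# Hypothetical items, not taken from the question's list
samples = [r"No\xc3\xabl", "Xbox360", r"McCafe\xc3\xa9"]

for s in samples:
    print(repr(s), "->", repr(strip_loose(s)), "vs", repr(strip_hex_escapes(s)))
```

With the loose pattern, the escape ending in a letter (\xab) is left behind and ordinary text such as x36 in Xbox360 is altered; the hex-based pattern avoids both in these samples.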
