I am new to Python and I am struggling with encoding.

I have a list of strings like this:

keys = ["u'part-00000-6edc0ee4-de74-4f82-9f8c-b4c965896224-c000.csv'", 
        " u'part-00001-6edc0ee4-de74-4f82-9f8c-b4c965896224-c000.csv'"]

I do this to encode them:

keys = [x.encode('UTF-8') for x in keys]

However, I am getting a "b" prefixed to each string, the result being

[b"u'part-00000-6edc0ee4-de74-4f82-9f8c-b4c965896224-c000.csv'", 
 b" u'part-00001-6edc0ee4-de74-4f82-9f8c-b4c965896224-c000.csv'"]

I thought it would be simpler to just encode with UTF-8.

What am I doing wrong?

Selcuk
Yogi
  • Does this answer your question? [Python 3 - Encode/Decode vs Bytes/Str](https://stackoverflow.com/questions/14472650/python-3-encode-decode-vs-bytes-str) – qorka Dec 18 '19 at 00:20
  • How did you end up with that original list of strings in the first place? It looks like the result of a series of conversions that went wrong. Also, is this Python 2 or 3? – Selcuk Dec 18 '19 at 00:20
  • It's Python 3. I get the list from an external source that I have no control over. – Yogi Dec 18 '19 at 00:28

1 Answer

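You should first try fixing the method you use to obtain your original list of strings, because each element currently contains the text of a Python string literal (including the u prefix and the quotes) rather than the file name itself. If you have no control over that, you can parse each literal with the following: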

>>> import ast
>>> [ast.literal_eval(i.strip()) for i in keys]

The result should be

[u'part-00000-6edc0ee4-de74-4f82-9f8c-b4c965896224-c000.csv', 
 u'part-00001-6edc0ee4-de74-4f82-9f8c-b4c965896224-c000.csv']

for Python 2, and

['part-00000-6edc0ee4-de74-4f82-9f8c-b4c965896224-c000.csv', 
 'part-00001-6edc0ee4-de74-4f82-9f8c-b4c965896224-c000.csv']

for Python 3.
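
As for the "b" in the question: in Python 3, str.encode() returns a bytes object, and the b prefix is only how bytes are displayed, so nothing was added to the data. Below is a minimal sketch for Python 3 that puts the clean-up and the encoding together. The helper name clean_keys is made up for illustration, and the sketch assumes every element is a valid string literal once the surrounding whitespace is stripped.

```python
import ast

keys = ["u'part-00000-6edc0ee4-de74-4f82-9f8c-b4c965896224-c000.csv'",
        " u'part-00001-6edc0ee4-de74-4f82-9f8c-b4c965896224-c000.csv'"]


def clean_keys(raw_keys):
    """Parse strings that hold a string literal (e.g. "u'name'") into plain str.

    Hypothetical helper: assumes each element is a valid literal after strip().
    ast.literal_eval raises ValueError or SyntaxError otherwise.
    """
    return [ast.literal_eval(k.strip()) for k in raw_keys]


cleaned = clean_keys(keys)

# Encoding a str gives bytes; the b'...' shown by print/repr is display only.
encoded = [k.encode("utf-8") for k in cleaned]

# Decoding the bytes gives back the original str values.
decoded = [b.decode("utf-8") for b in encoded]

print(cleaned)
print(encoded)
```

If you only need the file names as text, cleaned is already what you want and the encode step can be skipped.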

Selcuk