I have rows like the following in a plain text file:

  181006\td3a8d0236\tNicol\xc3\xa1s\tPe\xc3\xb1a\tmisc.person@email.com

I'd like to open and read the file using Python, then print out each line in its decoded form:

  181006 d3a8d0236        Nicolás Peña    misc.person@email.com

With a bytes literal this is pretty easy:

import codecs
a = b'181006\t000d3a8d0236\tNicol\xc3\xa1s\tPe\xc3\xb1a\tmisc.person@email.com'
b = codecs.decode(a)
print(b)

However, try as I might, I can't find the equivalent of the b'' literal syntax for data held in a variable. There are multiple SO posts about this, but I've had no luck using open()/read()/write() etc. Can someone offer a suggestion?

bad_coder
JRomeo
  • Does this answer your question? [What is the equivalent to b'string' on a variable?](https://stackoverflow.com/questions/23072834/what-is-the-equivalent-to-bstring-on-a-variable) – GockOckLock Apr 21 '21 at 03:06
  • You have code to read the file? `open("somefile.csv", encoding="utf-8")` should open the file correctly. – tdelaney Apr 21 '21 at 03:44
  • Read the file as binary, then `s = s.decode('unicode_escape').encode('latin1').decode('utf8')` should work if you have literal escape codes in the file. – Mark Tolonen Apr 21 '21 at 18:39
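
The last two comments cover different file contents, so which one applies depends on what is actually stored on disk. Below is a minimal sketch of both cases, assuming the file either holds real UTF-8 bytes or holds the escape sequences as literal backslash text. The helper names `read_utf8_lines` and `read_escaped_lines` and the temporary files are illustrative only and do not come from the question.

```python
import os
import tempfile


def read_utf8_lines(path):
    """Case 1: the file holds real UTF-8 bytes; let open() decode them."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def read_escaped_lines(path):
    """Case 2: the file holds backslash escapes as literal text."""
    with open(path, "rb") as f:
        raw = f.read()
    # unicode_escape turns the literal escapes into characters,
    # latin1 maps those characters back to single bytes,
    # and utf8 decodes the resulting byte sequence.
    text = raw.decode("unicode_escape").encode("latin1").decode("utf8")
    return text.splitlines()


# Expected decoded row, taken from the question.
row = "181006\td3a8d0236\tNicolás\tPeña\tmisc.person@email.com"

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file containing genuine UTF-8 bytes.
    utf8_path = os.path.join(tmp, "utf8.txt")
    with open(utf8_path, "wb") as f:
        f.write(row.encode("utf-8") + b"\n")

    # Hypothetical file containing the escapes as literal backslash text.
    escaped_path = os.path.join(tmp, "escaped.txt")
    with open(escaped_path, "wb") as f:
        f.write(rb"181006\td3a8d0236\tNicol\xc3\xa1s\tPe\xc3\xb1a\tmisc.person@email.com" + b"\n")

    utf8_rows = read_utf8_lines(utf8_path)
    escaped_rows = read_escaped_lines(escaped_path)

print(utf8_rows[0])
print(escaped_rows[0])
```

If opening the file with `encoding="utf-8"` already prints the accented names correctly, the second helper is unnecessary.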

1 Answer

Have you tried .encode('utf_8')? It returns a bytes object, which is what the b'' prefix denotes.

Example:

a = str('181006\td3a8d0236\tNicol\xc3\xa1s\tPe\xc3\xb1a\tmisc.person@email.com')
print(a.encode('utf_8'))
b'181006\td3a8d0236\tNicol\xc3\x83\xc2\xa1s\tPe\xc3\x83\xc2\xb1a\tmisc.person@email.com'

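Then you can call .decode() on the result. Note that the output above is double-encoded (\xc3\x83\xc2\xa1 instead of \xc3\xa1) because each character of the str was encoded separately; encoding with 'latin1' instead of 'utf_8' reproduces the original bytes, which then decode to Nicolás and Peña.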
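
A minimal sketch of the encode-then-decode round trip, assuming the data arrives as a str whose characters are really UTF-8 byte values, as in the example above. `recover_text` is a hypothetical helper name. Encoding with 'latin1' maps each character back to exactly one byte, whereas 'utf_8' encodes 'Ã' and '¡' as two bytes each, which is the \xc3\x83\xc2\xa1 sequence visible in the printed output.

```python
def recover_text(mojibake):
    """Hypothetical helper: repair a str whose characters are UTF-8 byte values."""
    # latin1 maps code points 0-255 to the identical single byte,
    # so this rebuilds the original UTF-8 byte sequence before decoding it.
    return mojibake.encode("latin1").decode("utf_8")


a = '181006\td3a8d0236\tNicol\xc3\xa1s\tPe\xc3\xb1a\tmisc.person@email.com'

# Encoding with utf_8 and decoding again returns the same garbled str.
round_trip = a.encode("utf_8").decode("utf_8")

# Going through latin1 recovers the intended text.
fixed = recover_text(a)
print(fixed)
```

This only works when every character in the str has a code point below 256; anything else raises UnicodeEncodeError from the latin1 step.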