I'm trying to build a list of locations from one column of a CSV file in Python.
This is one entry in the column:
Rio Balira del Orien,Riu Valira d'Orient,Riu Valira d’Orient,Río Balira del Orien
This is the corresponding list in its current state:
locs = ['Rio Balira del Orien', "Riu Valira d'Orient", 'Riu Valira d\xe2\x80\x99Orient', 'R\xc3\xado Balira del Orien']
In my program, I need to check whether a given word is in the list, so I'm trying to get rid of the escaped byte sequences for accented letters, apostrophes, etc. (e.g. \xc3\xad = í) and have each location in plain lowercase ASCII. When I try to use the code
loclist = [x.encode('ascii').lower() for x in locs]
it throws the error:
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 12: ordinal not in range(128)
What command should I use instead?
Thanks!
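
For reference, here is a minimal sketch of the result I'm after. It assumes the entries are UTF-8 encoded byte strings, which is what the \xc3\xad and \xe2\x80\x99 sequences look like. The UnicodeDecodeError coming from .encode() also suggests Python 2, where calling .encode('ascii') on a byte string first decodes it implicitly as ASCII. The helper name to_ascii_lower is made up for illustration, and mapping the curly apostrophe to a plain one is my own choice rather than anything standard.

```python
# -*- coding: utf-8 -*-
import unicodedata

# The entries from the question, written explicitly as UTF-8 byte strings
# (assumption: the CSV file is UTF-8 encoded).
locs = [b'Rio Balira del Orien', b"Riu Valira d'Orient",
        b'Riu Valira d\xe2\x80\x99Orient', b'R\xc3\xado Balira del Orien']


def to_ascii_lower(raw):
    """Hypothetical helper: UTF-8 bytes -> plain lowercase ASCII text."""
    text = raw.decode('utf-8')                   # bytes -> unicode text
    text = text.replace(u'\u2019', u"'")         # curly apostrophe -> plain one
    text = unicodedata.normalize('NFKD', text)   # split letters from accents
    # Drop whatever still has no ASCII equivalent (the combining accents).
    return text.encode('ascii', 'ignore').decode('ascii').lower()


loclist = [to_ascii_lower(x) for x in locs]
print(loclist)
print('rio balira del orien' in loclist)
```

With this, the accented and unaccented spellings collapse to the same string, so a membership test only has to run the search word through the same helper first. Note that 'ignore' silently drops any character that has no decomposition to ASCII, which may or may not be acceptable for other place names.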