I have searched extensively, and while there are plenty of resources on this question, I can't get any of the answers to work. I have watched this talk from Ned Batchelder on Unicode (https://nedbatchelder.com/text/unipain.html) and read through lots of answers on Stack Overflow, but I'm still at a loss.
I'm using Python 3 and BeautifulSoup 4 to scrape and parse a table from Wikipedia. I have a list called fighter_B:
print(type(fighter_B))
<class 'list'>
print(type(fighter_B[0]))
<class 'bs4.element.NavigableString'>
The second and third items in the list contain names with non-English letters, for example Fabrício Werdum, and these throw an error. When I try to print the fighter name I get this error:
print(fighter_B[1])
UnicodeEncodeError: 'ascii' codec can't encode character '\xed' in position 4: ordinal not in range(128)
I've tried various encoding functions but I always end up throwing the same error.
[fighter.encode('utf-8') for fighter in fighter_B]
print(fighter_B[1])
UnicodeEncodeError: 'ascii' codec can't encode character '\xed' in position 4: ordinal not in range(128)
for i in fighter_B:
    i.encode('utf-8')
print(fighter_B[1])
UnicodeEncodeError: 'ascii' codec can't encode character '\xed' in position 4: ordinal not in range(128)
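A note on the two attempts above: as far as I understand, str.encode() returns a new bytes object and leaves the original string untouched, so the list still holds the same items when print() runs. A minimal sketch of that, using plain str values as a stand-in for the NavigableString items (the list contents here are made up for illustration):

```python
# Hypothetical stand-in for the scraped list; plain str instead of NavigableString.
fighter_B = ["Cain Velasquez", "Fabrício Werdum"]

# encode() builds new bytes objects; the result has to be kept to be of any use.
encoded = [fighter.encode("utf-8") for fighter in fighter_B]

# The original list is unchanged and still contains text, not bytes.
still_text = isinstance(fighter_B[1], str)
now_bytes = isinstance(encoded[1], bytes)

# ascii() escapes non-ASCII characters, so these prints are safe on any terminal.
print(ascii(fighter_B[1]))
print(ascii(encoded[1]))
```

So discarding the result of the comprehension (or of the loop body) would leave print(fighter_B[1]) doing exactly what it did before.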
[fighter.decode('utf-8') for fighter in fighter_B]
AttributeError: 'NavigableString' object has no attribute 'decode'
[str(fighter).decode('utf-8') for fighter in fighter_B]
AttributeError: 'str' object has no attribute 'decode'
[fighter.encode('ascii') for fighter in fighter_B]
UnicodeEncodeError: 'ascii' codec can't encode character '\xed' in position 4: ordinal not in range(128)
All the various answers I have seen have simply suggested encoding the variable to 'utf-8'. I'm not sure why the encoding isn't working here and I am wondering if it is due to the fact that each item in the list is of type 'bs4.element.NavigableString'. Any tips would be greatly appreciated as I feel totally stumped at this point.
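One way to test that last hypothesis is to take BeautifulSoup out of the picture and print a plain str to a stream with a known encoding. This is only a sketch: the helper try_print is a made-up name, and the in-memory streams are an assumption standing in for a terminal whose encoding is ASCII or UTF-8.

```python
import io

name = "Fabrício Werdum"  # plain str, no BeautifulSoup involved


def try_print(text, encoding):
    """Print text to an in-memory stream that uses the given encoding.

    Returns the bytes written, or the UnicodeEncodeError if encoding fails.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, newline="\n")
    try:
        print(text, file=stream)
        stream.flush()
    except UnicodeEncodeError as exc:
        return exc
    return raw.getvalue()


ascii_result = try_print(name, "ascii")
utf8_result = try_print(name, "utf-8")

# ascii() keeps the output printable even if the real terminal is ASCII-only.
print(ascii(ascii_result))
print(ascii(utf8_result))
```

If the plain str fails on the ASCII stream the same way, the NavigableString type is probably not the cause, and checking sys.stdout.encoding in the environment where print() fails would be the next thing to look at.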