I am having trouble handling a non-ASCII character (£) in Python 2.7: either the source file's encoding declaration is not picked up, or the string comparison fails.
import unicodecsv as csv
import re
import pyodbc
import sys
import unicodedata
#!/usr/bin/python
# -*- coding: UTF-8 -*-
def remove_non_ascii_1(text):
    text.encode('utf-8')
    for i in text:
        return ''.join(i for i in text if i=='£')
In Python 2.7 I get the error:
SyntaxError: Non-ASCII character '\xc2' in file on line 16, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details
If I replace the literal £ with the byte escape '\xc2':
return ''.join(i for i in text if i=='\xc2')
I get this warning instead:
UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
Sample input, a row as read from the CSV file:
[u'06/11/2020', u'ABC', u'32154E', u'3214', u'DEF', u'Cash Purchase', u'Final', u'', u'20.00%', u'ABC', u'Sold From Pickup', u'New ', u'10.00%', u'0', u'15%', u'\xa3469.84', u'Jonathan Jerome', u'3', u'\xa3140.95', u'2%', u'\xa393.97', u'\xa39,396.83', u'', u'\xa35,638.00', u'30/06/2020', u'4', u'Boiler-Amended']
I want to remove the \xa3 (£) from the currency fields.
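To make the goal concrete, here is a minimal sketch of the result I am after, assuming each row is a list of unicode strings like the one above. The names POUND and strip_pound are hypothetical, and the pound sign is written as the escape u'\xa3' so that the source file stays pure ASCII and does not depend on an encoding declaration.

```python
# Hypothetical helper: remove the pound sign from every field of a row.
# u'\xa3' is the Unicode code point U+00A3 (POUND SIGN), matching the
# u'\xa3...' values in the sample row.
POUND = u'\xa3'


def strip_pound(row):
    """Return a new list with the pound sign removed from each field."""
    return [field.replace(POUND, u'') for field in row]


# Shortened version of the sample row, for illustration only.
sample = [u'06/11/2020', u'ABC', u'\xa3469.84', u'\xa39,396.83', u'']
cleaned = strip_pound(sample)
```

With the shortened sample, cleaned should keep the non-currency fields unchanged and hold u'469.84' and u'9,396.83' for the two currency fields.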