I'm pulling some licensure data and placing it into a list.
rank = ['\r\n\t\t', 'RANK2', 'Rank II', '07', '-', '01', '-', '2016', u'\xa0', '06', '-', '30', '-', '2021', u'\xa0', '\r\n\t']
cert = ['\r\n\t\t', 'KEL', 'Professional Certificate For Teaching In Elementary School, Primary Through Grade 5', '07', '-', '01', '-', '2016', u'\xa0', '06', '-', '30', '-', '2021', u'\xa0', '\r\n\t']
I want to remove the whitespace-only entries ('\r\n\t\t', '\r\n\t') and the non-breaking spaces (u'\xa0') from my lists, join the date pieces back together, and ultimately get my lists to look like this:
rank = ['RANK2', 'Rank II', '07-01-2016', '06-30-2021']
cert = ['KEL', 'Professional Certificate For Teaching In Elementary School, Primary Through Grade 5', '07-01-2016', '06-30-2021']
I've looked through other questions about removing escape sequences, unicode, and non-ASCII characters from lists, but I can't get any of them to work for my situation.
Some get close but no cigar:
[word for word in cert if word.isalnum()]
>>> ['KEL', '07', '01', '2016', '06', '30', '2021']
def recursive_map(lst, fn):
    return [recursive_map(x, fn) if isinstance(x, list) else fn(x) for x in lst]
recursive_map(rank, lambda x: x.encode("ascii", "ignore"))
>>> ['\r\n\t\t', 'RANK2', 'Rank II', '07', '-', '01', '-', '2016', '', '06', '-', '30', '-', '2021', '', '\r\n\t']
I'm stuck in a rut at the moment...anyone have any ideas?
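One direction that might work, as a rough sketch rather than a definitive fix: strip each item, drop whatever ends up empty (strip() on a unicode string also removes u'\xa0'), then glue the '-' pieces back onto the item before them. The name clean_fields is made up for illustration, and the sketch assumes that no ordinary text field ends with a hyphen, since such a field would swallow the item after it.

```python
def clean_fields(items):
    """Drop whitespace-only items and rejoin date parts split around '-'.

    Hypothetical helper; assumes a standalone '-' only appears inside dates
    and that no other field ends with '-'.
    """
    # strip() removes \r, \n, \t and (for unicode strings) the \xa0 non-breaking space
    tokens = [s.strip() for s in items if s.strip()]

    result = []
    for tok in tokens:
        # Attach a '-' to the previous item, and attach whatever follows a '-'
        if result and (tok == '-' or result[-1].endswith('-')):
            result[-1] += tok
        else:
            result.append(tok)
    return result


rank = ['\r\n\t\t', 'RANK2', 'Rank II', '07', '-', '01', '-', '2016', u'\xa0',
        '06', '-', '30', '-', '2021', u'\xa0', '\r\n\t']

print(clean_fields(rank))
```

With the rank list above this should give the four-item list I'm after, and the same call on cert should behave the same way, because the long description contains no standalone '-' item.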