I believe in the Unicode sandwich. I use the Unicode sandwich. So why is it that when I run the following on a byte string (Python 2.7)...
label = label.decode("utf-8")
I still get an error:
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/usr/local/lib/python2.7/site-packages/celery/app/trace.py", line 648, in __protected_call__
    return self.run(*args, **kwargs)
  File "/opt/celery/cl/scrapers/tasks.py", line 638, in update_docket_info_iquery
    d = update_docket_metadata(d, report.metadata)
  File "/usr/local/lib/python2.7/site-packages/juriscraper/pacer/case_query.py", line 166, in metadata
    self._get_label_value_pair(bold, True, field_names)
  File "/usr/local/lib/python2.7/site-packages/juriscraper/pacer/docket_report.py", line 233, in _get_label_value_pair
    label = label.decode("utf-8") <---- Shouldn't this work?
  File "/usr/local/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xa0' in position 6: ordinal not in range(128)
And why is this throwing a UnicodeEncodeError when I'm trying to do a decode on the line that crashes?
I'm confused. Again.
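One possible explanation, sketched here under the unconfirmed assumption that label is sometimes already a unicode object rather than a byte string: on Python 2, calling .decode() on a unicode object first encodes it implicitly with the default ASCII codec, and that hidden step is what would raise a UnicodeEncodeError. The helper name to_unicode and the sample label below are hypothetical and are not taken from juriscraper.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def to_unicode(value, encoding="utf-8"):
    """Hypothetical helper: decode byte strings, pass text through untouched."""
    if isinstance(value, bytes):  # `bytes` is an alias for `str` on Python 2
        return value.decode(encoding)
    return value


# Assumed input: a label that is already text and contains a no-break space.
text_label = u"Label:\xa0value"

# This is the step Python 2 performs implicitly before decoding when
# .decode() is called on a unicode object; it fails on non-ASCII characters.
try:
    text_label.encode("ascii")
except UnicodeEncodeError as exc:
    print("implicit encode step fails:", exc)

# The guarded helper accepts either type without raising.
byte_label = text_label.encode("utf-8")
assert to_unicode(byte_label) == text_label
assert to_unicode(text_label) == text_label
```

If that assumption holds, checking type(label) just before the failing line would show unicode rather than str, and a type guard like the one above (or decoding once at the input boundary) would avoid the implicit encode.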