I was getting the same error when the code was `open(filname)`. Following the discussion in "Python pickle error: UnicodeDecodeError", I changed it to `open(filname, 'rb')`, but I am still getting the error:

/Users/sheetalpandrekar/Google Drive/Research/Health-care Analytics/Twitter/Opioid/TwitterUserType/PrepareData/Twitter_User_Types/Classifier/TwoPhaseTwitterClassifier2.py in loadClassifierFromFile(self, class1, class2)
     32 
     33     def loadClassifierFromFile(self,class1,class2):
---> 34         self.clf1=pickle.load(open(class1, 'rb'))
     35         self.clf2=pickle.load(open(class2, 'rb'))
     36         pass

/Library/anaconda/lib/python3.5/codecs.py in decode(self, input, final)
    319         # decode input (taking the buffer into account)
    320         data = self.buffer + input
--> 321         (result, consumed) = self._buffer_decode(data, self.errors, final)
    322         # keep undecoded input until the next call
    323         self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte
sheetal_158
  • Did you open the file for *write* in text mode (easy to forget on Python 2) on a Windows system? If so, your pickle file is probably corrupt (`\n` is converted to `\r\n` as it is written). Your error is caused by trying to decode non-UTF-8 bytes as UTF-8, which shouldn't happen for a valid pickle read in binary mode. But if the data was corrupted by line-ending translation on write, the parser could end up misaligned and start interpreting random bytes as pickle format codes and data. – ShadowRanger Oct 24 '17 at 22:51
  • Have you tried `self.clf1=pickle.load(open(class1, 'rb'), encoding='latin1')`? – rodgdor Oct 26 '17 at 23:42
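
The two suggestions in the comments can be sketched together as follows. This is only a minimal illustration, not the asker's actual code: `save_classifier`, `load_classifier`, the `model` dictionary, and the temporary file path are all hypothetical stand-ins. It writes and reads the pickle in binary mode, falls back to `encoding='latin1'` for pickles that may have been written by Python 2, and shows that reading the same file in text mode produces a `UnicodeDecodeError` on byte 0x80, the marker that opens every pickle of protocol 2 or later.

```python
import os
import pickle
import tempfile


def save_classifier(obj, path):
    # Write in binary mode ('wb') so no newline translation can alter the bytes.
    with open(path, "wb") as fh:
        pickle.dump(obj, fh, protocol=2)


def load_classifier(path):
    # Read in binary mode ('rb'). If decoding old Python 2 strings fails,
    # rewind and retry with encoding='latin1', as suggested in the comments.
    with open(path, "rb") as fh:
        try:
            return pickle.load(fh)
        except UnicodeDecodeError:
            fh.seek(0)
            return pickle.load(fh, encoding="latin1")


# Hypothetical stand-in for a trained classifier object.
model = {"labels": ["person", "organization"], "threshold": 0.5}
path = os.path.join(tempfile.mkdtemp(), "clf1.pkl")

save_classifier(model, path)
restored = load_classifier(path)

# Pickles of protocol 2 and later begin with byte 0x80,
# which is not a valid UTF-8 start byte.
first_byte = pickle.dumps(model, protocol=2)[0]

# Reading the same file in text mode fails while decoding that first byte.
try:
    with open(path, encoding="utf-8") as fh:
        fh.read()
    text_mode_error = None
except UnicodeDecodeError as exc:
    text_mode_error = str(exc)
```

If the error still goes through `codecs.py` after switching to `'rb'`, one possibility worth checking is that the interpreter is still running the old version of the module, since a file opened in binary mode is not decoded at all.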

0 Answers