I downloaded the Sentiment140 dataset and tried to open it with pd.read_csv(),
but I got this error: UnicodeDecodeError: 'utf-8' codec can't decode bytes in position 232719-232720: invalid continuation byte
After checking the file's encoding with the Unix file command, I passed encoding='utf-8' explicitly to read_csv(),
but I still get the same error.
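
To narrow this down, here is a minimal sketch of how the bytes around the failing position could be inspected using only the Python standard library. The helper name find_decode_error and the 20-byte context window are illustrative assumptions of mine, not part of pandas, and the sketch reads the whole file into memory.

```python
def find_decode_error(path, encoding="utf-8", context=20):
    """Return (start, end, nearby_bytes) for the first decode failure, or None.

    Hypothetical helper for illustration: `context` is the number of bytes
    kept on each side of the offending span.
    """
    # Read raw bytes so that no decoding happens implicitly.
    with open(path, "rb") as f:
        data = f.read()
    try:
        data.decode(encoding)
    except UnicodeDecodeError as err:
        # err.start / err.end delimit the bytes the codec rejected.
        lo = max(err.start - context, 0)
        return err.start, err.end, data[lo:err.end + context]
    # The whole file decoded cleanly with this encoding.
    return None
```

Calling find_decode_error on the dataset file returns None if the file is valid UTF-8, and otherwise the raw bytes near the first failure. The offset it reports may not match the one in the pandas traceback, since I am not sure pandas reports positions relative to the whole file.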