
I am following this website, and everything works until the second-to-last line.

I encounter an error when running:

sample_df = pd.read_csv(io.StringIO(uploaded['sa.csv'].decode('utf-8')))
sample_df.head()

For sample_df = pd.read_csv(io.StringIO(uploaded['sa.csv'].decode('utf-8'))), I get this:

UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-44-c79110307396> in <module>()
----> 1 sample_df = pd.read_csv(io.StringIO(uploaded['sa.csv'].decode('utf-8')))

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 2716736: invalid continuation byte

For sample_df.head(), I get this:

NameError                                 Traceback (most recent call last)
<ipython-input-43-c589eab13420> in <module>()
----> 1 sample_df.head()

NameError: name 'sample_df' is not defined

Can someone please help me with this problem?

Kuldeep Singh Sidhu
  • 3,748
  • 2
  • 12
  • 22

1 Answer

Your line sample_df = pd.read_csv(io.StringIO(uploaded['sa.csv'].decode('utf-8'))) did not execute; it raised an encoding error.

Because of that, sample_df was never created, which is why you then got NameError: name 'sample_df' is not defined.
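To see why the second error is only a consequence of the first, here is a minimal sketch. The bytes are made up for illustration (they are not from your file), and the variable names are hypothetical:

```python
# Made-up bytes: 0xe2 starts a multi-byte UTF-8 sequence, but the next
# byte is a plain space, so the sequence is invalid.
raw = b"caf\xe2 au lait"

try:
    decoded = raw.decode("utf-8")  # raises, so the assignment never happens
except UnicodeDecodeError as exc:
    first_error = exc.reason

try:
    decoded  # the name was never bound because the line above failed
    name_exists = True
except NameError:
    name_exists = False

print(first_error)
print(name_exists)
```

Once the decoding step succeeds, the NameError goes away on its own.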

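You can try a different encoding: pd.read_csv('file', encoding="ISO-8859-1"). In your case, where the file contents are already in memory as bytes, that means calling .decode('ISO-8859-1') instead of .decode('utf-8').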

You can also use one of several alias options like 'latin' instead of 'ISO-8859-1' (see python docs, also for numerous other encodings you may encounter).
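As a rough sketch of that idea, the helper below tries UTF-8 first and falls back to ISO-8859-1. The function name decode_with_fallback and the sample bytes are assumptions made up for illustration; it uses only the standard library so it runs without pandas:

```python
import csv
import io

def decode_with_fallback(raw, encodings=("utf-8", "ISO-8859-1")):
    """Return (text, encoding) for the first candidate that decodes raw."""
    for enc in encodings:
        try:
            return raw.decode(enc), enc
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")

# Made-up stand-in for uploaded['sa.csv']: Latin-1 bytes where 0xe2 is "â".
raw = b"name,price\nch\xe2teau,12\n"

text, used = decode_with_fallback(raw)
rows = list(csv.reader(io.StringIO(text)))

print(used)
print(rows)
```

The resulting text can be passed to pd.read_csv(io.StringIO(text)) exactly as in the question. Alternatively, pd.read_csv(io.BytesIO(uploaded['sa.csv']), encoding='ISO-8859-1') should let pandas do the decoding itself. Keep in mind that ISO-8859-1 maps every byte to some character, so it never raises; if the file is really in another encoding (for example cp1252), some characters may come out wrong even though no error is reported.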

See the relevant Pandas documentation, the python docs examples on csv files, and plenty of related questions here on SO. A good background resource is What every developer should know about unicode and character sets.
