NotImplementedError: Text mode not supported, use mode='wb' and manage bytes in s3fs

Question

I know there is a similar question, but it is more general and not specific to this package. I am saving a pandas DataFrame from a SageMaker Jupyter notebook to a CSV file in S3 as follows:

df.to_csv('s3://bucket/key/file.csv', index=False)

However I am getting the following error:

NotImplementedError: Text mode not supported, use mode='wb' and manage bytes

Roughly, the code reads a CSV from S3, does some preprocessing on it, and then saves the result back to S3. Reading the CSV from S3 works fine with:

pd.read_csv('s3://bucket/key/file.csv')

The object that I am trying to save to S3 is indeed a pandas.core.frame.DataFrame.

In the notebook, !pip show reports that I have pandas 0.24.2 and s3fs 0.1.5.

What could be the problem?

Perhaps a common problem in Pandas 0.24.2? See: https://stackoverflow.com/questions/38154040/save-dataframe-to-csv-directly-to-s3-python/56275519#comment103490289_56275519 — Alastair McCormack, Feb 03 '20 at 09:12
And https://github.com/pandas-dev/pandas/issues/8508#issuecomment-559248732 suggests it's fixed in 0.25.x — Alastair McCormack, Feb 03 '20 at 09:15

score 4 · Accepted Answer · answered Jan 31 '20 at 17:58


Can you please try:

df.to_csv("s3://bucket/key/file.csv", index=False, mode='wb')

It should fix your error. The default mode is 'w', which writes to the file system as text rather than bytes, whereas S3 expects the data as bytes. Hence you have to specify the mode as 'wb' (write bytes) when writing the DataFrame as CSV to the file system.
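To illustrate the text versus bytes distinction, here is a minimal sketch that uses an in-memory io.BytesIO buffer as a stand-in for a file opened in binary mode. The helper name write_text_as_bytes and the sample CSV string are hypothetical, and the sketch does not claim to reproduce what pandas 0.24.2 does internally. It only shows that writing a str to a binary stream raises TypeError, while encoding the text first succeeds.

```python
import io


def write_text_as_bytes(text, binary_file, encoding="utf-8"):
    """Encode text and write it to a file object opened in binary mode.

    Hypothetical helper for illustration; returns the number of bytes written.
    """
    data = text.encode(encoding)
    binary_file.write(data)
    return len(data)


csv_text = "a,b\n1,2\n"      # stand-in for the CSV text of a DataFrame
binary_file = io.BytesIO()   # stand-in for a file opened with mode='wb'

# Writing a str straight into a binary stream is rejected by Python.
try:
    binary_file.write(csv_text)
    text_write_failed = False
except TypeError:
    text_write_failed = True

# Encoding the text first makes the same write succeed.
written = write_text_as_bytes(csv_text, binary_file)
```

Under the same assumptions, a possible workaround on an old pandas is to build the CSV text with df.to_csv(index=False), which returns a string when no path is given, and pass it to this helper together with a file object obtained from s3fs.S3FileSystem().open('s3://bucket/key/file.csv', 'wb').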

Hitesh Salavi


Then it throws the error TypeError: a bytes-like object is required, not 'str'. It is weird because I have used s3fs a lot to save CSVs with pandas to S3 and I have never had to explicitly set mode='wb' – Javier Lopez Tomas Feb 03 '20 at 09:04

I am getting the same problem today. """ TypeError: a bytes-like object is required, not 'str' """ – Roko Mijic Apr 03 '20 at 08:33
Anyone know what's going on? Did AWS change something? – Roko Mijic Apr 03 '20 at 08:44

Roko Mijic · Answer 2 · 2020-04-03T09:12:32.517

1

I just had this problem.

The cause seems to be an old version of pandas. Run

!pip install --upgrade pandas

in your Jupyter Notebook.

You might have an old version of pandas if you shut down and then restarted your AWS machine: the AWS environments ship with older versions of pandas (that is what happened to me). This problem was fixed last year.
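Before upgrading, it can help to confirm that the installed version really is an old one. The following is a rough sketch using only the standard library (importlib.metadata needs Python 3.8 or later). The 0.25 threshold is an assumption taken from the comment on the question that suggests the issue is fixed in 0.25.x; it has not been verified here, and the function names are hypothetical.

```python
import re
from importlib import metadata  # available from Python 3.8

# Assumed threshold, based on the comment saying the issue is fixed in 0.25.x.
ASSUMED_FIXED_IN = (0, 25)


def version_tuple(version):
    """Return the leading (major, minor) integers of a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def needs_upgrade(version, fixed_in=ASSUMED_FIXED_IN):
    """True if the given version is older than the assumed fixed release."""
    return version_tuple(version) < fixed_in


def installed_pandas_needs_upgrade():
    """Check the installed pandas; returns None if pandas is not installed."""
    try:
        return needs_upgrade(metadata.version("pandas"))
    except metadata.PackageNotFoundError:
        return None
```

With the version from the question, needs_upgrade("0.24.2") is true, so running the upgrade command above would be the next step.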

edited Apr 03 '20 at 09:12

answered Apr 03 '20 at 09:03

Roko Mijic

