
I am trying to upload a text file to an S3 bucket from Python, using the s3fs library to connect and upload to AWS, but I get the error below when I try to upload the content.

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/s3fs/core.py", line 112, in _error_wrapper
    return await func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/aiobotocore/client.py", line 358, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (MalformedXML) when calling the PutObject operation: The XML you provided was not well-formed or did not validate against our published schema

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ubuntu/airflow/twitter_dag/twitter_etl.py", line 46, in <module>
    run_twitter_etl()
    self.commit()
  File "/usr/local/lib/python3.10/dist-packages/s3fs/core.py", line 2188, in commit
    write_result = self._call_s3(
  File "/usr/local/lib/python3.10/dist-packages/s3fs/core.py", line 2040, in _call_s3
    return self.fs.call_s3(method, self.s3_additional_kwargs, *kwarglist, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/fsspec/asyn.py", line 113, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/fsspec/asyn.py", line 98, in sync
    raise return_result
  File "/usr/local/lib/python3.10/dist-packages/fsspec/asyn.py", line 53, in _runner
    result[0] = await coro
  File "/usr/local/lib/python3.10/dist-packages/s3fs/core.py", line 339, in _call_s3
    return await _error_wrapper(
  File "/usr/local/lib/python3.10/dist-packages/s3fs/core.py", line 139, in _error_wrapper
    raise err
OSError: [Errno 22] The XML you provided was not well-formed or did not validate against our published schema

Code:

s3 = s3fs.S3FileSystem(anon=False)
with s3.open(f"<bucket-name>/<filename>.txt", 'w') as f:
    f.write('Hello')

The code was running on an EC2 instance with access provided through an instance role. I tested it by running the `aws s3 ls` command, which returned my list of buckets, and I was able to access the provided bucket.

coder2911
  • According to [this answer](https://stackoverflow.com/a/56275519) by [yardstick17](https://stackoverflow.com/users/4572274/yardstick17) to [Save Dataframe to csv directly to s3 Python](https://stackoverflow.com/q/38154040), this should work in Pandas 0.24.1. What version are you using? Also, what happens if you save your CSV file to a string, then upload the string as shown in [this answer](https://stackoverflow.com/a/40615630)? Maybe there's some specific problem with your DataFrame that is impeding the CSV conversion. – dbc Dec 15 '22 at 20:21
  • Did you define your [credentials](https://s3fs.readthedocs.io/en/latest/#credentials) correctly, as mentioned in https://pandas.pydata.org/docs/user_guide/io.html?highlight=storage_options#reading-writing-remote-files? – dbc Dec 15 '22 at 20:32
  • I am getting the same error, even when I just tried to write a line to a text file instead of a CSV file. All credentials are defined clearly and I am able to verify the successful connection through aws cli. – coder2911 Dec 15 '22 at 23:05
  • Interesting. Any chance you could [edit] your question to boil it down to a [mcve]? Sounds like the problem is lower level than `to_csv`, and the more extraneous info you can strip out, the more likely it is that someone here can help. – dbc Dec 15 '22 at 23:08
  • Please also provide your Python runtime version, s3fs version, boto3, botocore versions - all are applicable to determining what the issue might be. – Mike Fiedler Dec 20 '22 at 16:35

1 Answer


I was experiencing the same problem, albeit when trying to write to an S3-compatible storage instance. Passing an explicit ACL through `s3_additional_kwargs` solved it:

import s3fs

# Any other keyword arguments you normally pass to S3FileSystem
other_s3fs_kwargs = {<your kwargs here>}

# Passing an explicit ACL via s3_additional_kwargs makes s3fs include it
# on the PutObject call, which avoided the MalformedXML error for me
fs = s3fs.S3FileSystem(**other_s3fs_kwargs, s3_additional_kwargs={"ACL": "private"})
sharmu1