
I'm using S3Hook in my task to download files from an S3 bucket on DigitalOcean Spaces. Here is an example of credentials that work perfectly with boto3 but cause errors when used in S3Hook:

[s3_bucket]
default_region = fra1
default_endpoint=https://fra1.digitaloceanspaces.com
default_bucket=storage-data
bucket_access_key=F7QTVFMWJF73U75IB26D
bucket_secret_key=mysecret
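
For reference, this is roughly how the same credentials are used with boto3 (a sketch; the object key some_file.csv and the local path are just placeholders):

import boto3

# Sketch of the working boto3 setup; object key and local path are placeholders
session = boto3.session.Session()
client = session.client(
    's3',
    region_name='fra1',
    endpoint_url='https://fra1.digitaloceanspaces.com',
    aws_access_key_id='F7QTVFMWJF73U75IB26D',
    aws_secret_access_key='mysecret',
)
client.download_file('storage-data', 'some_file.csv', '/tmp/some_file.csv')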

This is how I filled the connection form in Admin -> Connections (screenshot not included).

Here is what I see in task's .log file:

ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

So I guess the connection form is filled in wrong. What is the proper way to set all of the S3 parameters (key, secret, bucket, host, region, etc.)?

arturkuchynski

2 Answers


Moving the host variable to Extra did the trick for me.

For some reason, Airflow is unable to establish the connection with a custom S3 host (one different from AWS, like DigitalOcean) unless it is set in the Extra field.

Also, region_name can be left out of Extra in a case like mine.
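
For example, with the fra1 endpoint from the question, the Extra field would look roughly like this (a sketch; the access key and secret stay in the Login and Password fields):

{"host": "https://fra1.digitaloceanspaces.com"}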

arturkuchynski

To get this working with Airflow 2.1.0 on DigitalOcean Spaces, I had to pass the aws_conn_id explicitly:

from airflow.providers.amazon.aws.hooks.s3 import S3Hook

s3_client = S3Hook(aws_conn_id='123.ams3.digitaloceanspaces.com')

Fill in Schema with the bucket name, Login with the access key, and Password with the secret; the Extra field in the UI then holds the region and host:

{"host": "https://ams3.digitaloceanspaces.com", "region_name": "ams3"}
Tomp