
I am trying to move my Python code to Airflow. I have the following code snippet:

    s3_client = boto3.client(
        's3',
        region_name="us-west-2",
        aws_access_key_id=aws_access_key_id,
        aws_secret_access_key=aws_secret_access_key,
    )

I am trying to recreate this s3_client using Airflow's S3 hook and S3 connection, but I can't find a way in any documentation to do it without specifying the aws_access_key_id and the aws_secret_access_key directly in code.

Any help would be appreciated.

SelectStarFrom

1 Answer


You need to define an AWS connection in Admin -> Connections or with the CLI (see the docs). Once the connection is defined, you can use it in S3Hook. Your connection object can be set as:

Conn Id: <your_choice_of_conn_id_name>
Conn Type: Amazon Web Services
Login: <aws_access_key>
Password: <aws_secret_key>
Extra: {"region_name": "us-west-2"}


In Airflow, hooks wrap a Python package. Thus if your code uses the hook, there shouldn't be a reason to import boto3 directly.
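
For example, a minimal sketch (the conn id, bucket name, and prefix here are placeholders, not values from your setup):

    from airflow.providers.amazon.aws.hooks.s3 import S3Hook

    # The hook pulls the credentials and region from the Airflow connection,
    # so nothing sensitive has to appear in the code itself.
    hook = S3Hook(aws_conn_id='aws_s3_conn')
    keys = hook.list_keys(bucket_name='my-bucket', prefix='data/')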

Elad Kalif
  • So just so that I understand this correctly: I just need to set up an AWS connection (not an S3 connection?) and boto3 won't need creds for creating an s3 client? – SelectStarFrom Feb 03 '21 at 23:07
  • @SelectStarFrom boto3 gets the creds from S3Hook, which gets them from AwsBaseHook, which gets them from the connection you define. When you interact with the S3Hook, you just give it the conn_id you defined. I suggest that you create a connection and write a simple piece of code that downloads a file from S3 using the S3Hook (see the sketch after these comments); you will see that your code doesn't mention boto3. You don't need to interact with boto directly, you interact with the hook. If you need functionality from boto that the hook doesn't have, you simply add it to the hook. – Elad Kalif Feb 03 '21 at 23:22
  • @SelectStarFrom see this example: https://github.com/apache/airflow/blob/master/airflow/providers/amazon/aws/example_dags/example_s3_bucket.py – Elad Kalif Feb 03 '21 at 23:23
  • Thank you, that worked perfectly. I was just thinking of using boto3 out of habit and didn't realize I can use the S3Hook itself. – SelectStarFrom Feb 04 '21 at 04:07
  • @Elad There is a conn type of S3 in the Airflow connections. I'd recommend using that instead of AWS Conn since it's more explicit – Gabe Feb 05 '21 at 21:48
  • @Gabe the S3 conn type is deprecated; when you use it you will see a deprecation notice in the task log. – Elad Kalif Feb 06 '21 at 00:43
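
A minimal sketch of the download flow suggested in the comments above, reusing the placeholder conn id and an assumed bucket/key:

    from airflow.providers.amazon.aws.hooks.s3 import S3Hook

    hook = S3Hook(aws_conn_id='aws_s3_conn')
    # read_key returns the object's contents as a string; note that neither
    # boto3 nor any credentials appear anywhere in this code.
    content = hook.read_key(key='data/example.txt', bucket_name='my-bucket')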