I am following this tutorial: https://www.usgs.gov/media/files/landsat-cloud-direct-access-requester-pays-tutorial

import boto3
import rasterio as rio
from matplotlib.pyplot import imshow
from rasterio.session import AWSSession

s3 = boto3.client('s3', aws_access_key_id=AWS_KEY_ID,
                  aws_secret_access_key=AWS_SECRET)

resources = boto3.resource('s3', aws_access_key_id=AWS_KEY_ID,
                           aws_secret_access_key=AWS_SECRET)

aws_session = AWSSession(boto3.Session())

cog = 's3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/LC08_L2SP_026027_20200827_20200906_02_T1_SR_B2.TIF'

with rio.Env(aws_session):
    with rio.open(cog) as src:
        profile = src.profile
        arr = src.read(1)
imshow(arr)

I get the below error:

rasterio.errors.RasterioIOError: '/vsis3/usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/LC08_L2SP_026027_20200827_20200906_02_T1_SR_B2.TIF' does not exist in the file system, and is not recognized as a supported dataset name.

In AWS CloudShell, if I run: ``` aws s3 ls s3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/ ```

I get:

An error occurred (AccessDenied) when calling the ListObjectsV2 operation: Access Denied

I ran the same CloudShell commands on an EC2 instance and got the same errors.

I needed to specify that I am the requester; it's right there in the documentation. This works:

aws s3 ls s3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/ --request-payer requester

Using boto3 still doesn't work.

The user I was running boto3 with has admin permissions. I got the same error in CloudShell as both the boto user and root. I have used this access key and secret key before, and they work fine for downloading from the "landsat-pds" bucket (which only has L8 images) and the "sentinel-s2-l1c" bucket. It only seems to be an issue with the "usgs-landsat" bucket (https://registry.opendata.aws/usgs-landsat/).

Also tried accessing the usgs-landsat bucket with s3.list_objects:

landsat = resources.Bucket("usgs-landsat")
all_objects = s3.list_objects(Bucket = 'usgs-landsat')

I get a similar error:

botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the ListObjects operation: Access Denied

Looking at other solutions, some users fixed their issue by setting:

os.environ["AWS_REQUEST_PAYER"] = "requester"
os.environ["CURL_CA_BUNDLE"] = "/etc/ssl/certs/ca-certificates.crt"

This hasn't worked for me.

isaacm

2 Answers

As you have correctly pointed out, the usgs-landsat S3 bucket is requester pays, so you need to configure rasterio correctly in order to handle that.

rasterio.session.AWSSession accepts a requester_pays argument; set it to True to do this.

I can also point out that the lines:

s3 = boto3.client('s3', aws_access_key_id=AWS_KEY_ID,
                  aws_secret_access_key=AWS_SECRET)

resources = boto3.resource('s3', aws_access_key_id=AWS_KEY_ID,
                           aws_secret_access_key=AWS_SECRET)

in your code snippet are not needed since you do not reuse the s3 and resources variables later on.

In fact, if your credentials are correctly stored in your ~/.aws/ folder (which you can set up by running aws configure, the command-line utility provided by the awscli Python package; see its documentation), you do not need to import boto3 at all: rasterio creates the session for you.

Your code snippet can therefore be modified to:

import rasterio as rio
from matplotlib.pyplot import imshow
from rasterio.session import AWSSession

aws_session = AWSSession(requester_pays=True)

cog = 's3://usgs-landsat/collection02/level-2/standard/oli-tirs/2020/026/027/LC08_L2SP_026027_20200827_20200906_02_T1/LC08_L2SP_026027_20200827_20200906_02_T1_SR_B2.TIF'


with rio.Env(aws_session):
    with rio.open(cog) as src:
        profile = src.profile
        arr = src.read(1)
imshow(arr)

which runs correctly on my machine.
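
If the credentials are not stored in ~/.aws/ (for example, they only exist as environment variables), one option is to pass them to AWSSession explicitly. This is a minimal sketch, assuming the keys live in the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables; the helper names session_kwargs and read_band are made up for illustration:

```python
import os


def session_kwargs(env=os.environ):
    """Build keyword arguments for rasterio.session.AWSSession from a mapping."""
    # Assumption: the keys are stored under the standard AWS variable names.
    return {
        "aws_access_key_id": env["AWS_ACCESS_KEY_ID"],
        "aws_secret_access_key": env["AWS_SECRET_ACCESS_KEY"],
        "requester_pays": True,
    }


def read_band(url, band=1, env=os.environ):
    """Read one band from a requester-pays COG (the requester is billed)."""
    # Imported lazily so session_kwargs can be used without rasterio installed.
    import rasterio as rio
    from rasterio.session import AWSSession

    with rio.Env(AWSSession(**session_kwargs(env))):
        with rio.open(url) as src:
            return src.read(band)
```

Calling read_band(cog) with the cog URL above should then behave like the snippet in this answer, provided both variables are set.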

guampi
  • This works! The issue was that my .aws/config file wasn't set up correctly, thanks! I am a bit confused about why I could download Sentinel-2 data though, since that bucket is also requester pays? https://registry.opendata.aws/sentinel-2/ – isaacm Jun 03 '21 at 17:18
  • Good to hear! Could you please mark the answer as accepted? I don't know why you were able to access the sentinel-s2-l1c bucket, as it is indeed requester pays as well – guampi Jun 03 '21 at 17:39

This worked for me:

import boto3

s3sr = boto3.resource('s3')
bucket = 'usgs-landsat'
prefix = 'collection02/'
keys_list = []
paginator = s3sr.meta.client.get_paginator('list_objects_v2')
# RequestPayer='requester' is needed because the bucket is requester pays
for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter='/', RequestPayer='requester'):
    # 'Contents' is missing when a page has no objects, so default to an empty list
    keys = [content['Key'] for content in page.get('Contents', [])]
    keys_list.extend(keys)
len(keys_list)

# keys_list 
['collection02/catalog.json',
 'collection02/landsat-c2l1.json',
 'collection02/landsat-c2l2-sr.json',
 'collection02/landsat-c2l2-st.json',
 'collection02/landsat-c2l2alb-bt.json',
 'collection02/landsat-c2l2alb-sr.json',
 'collection02/landsat-c2l2alb-st.json',
 'collection02/landsat-c2l2alb-ta.json']
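
Note that with Delimiter='/' the listing only returns objects directly under the prefix; deeper "folders" such as collection02/level-2/ come back under CommonPrefixes rather than Contents. Here is a small sketch of collecting both from the paginator's pages, assuming the standard list_objects_v2 response shape (split_listing is a hypothetical helper name):

```python
def split_listing(pages):
    """Return (object keys, sub-prefixes) from list_objects_v2 pages."""
    keys, prefixes = [], []
    for page in pages:
        # Either entry is absent from a page that has nothing of that kind.
        keys.extend(item["Key"] for item in page.get("Contents", []))
        prefixes.extend(item["Prefix"] for item in page.get("CommonPrefixes", []))
    return keys, prefixes
```

It can be fed the paginator directly, e.g. keys, prefixes = split_listing(paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter='/', RequestPayer='requester')).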

# getting the catalog.json
response = boto3.client('s3').get_object(Bucket=bucket, Key='collection02/catalog.json', RequestPayer='requester')
jsondata = response['Body'].read().decode()
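
The same RequestPayer parameter should apply to the scene files themselves. Below is a sketch that splits an s3:// URL like the one in the question into Bucket and Key before calling get_object; split_s3_url and fetch_object are hypothetical helper names, and the requester is billed for the download:

```python
from urllib.parse import urlparse


def split_s3_url(url):
    """Split 's3://bucket/some/key' into (bucket, key)."""
    parsed = urlparse(url)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an s3:// URL: {url!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def fetch_object(url):
    """Download an object from a requester-pays bucket and return its bytes."""
    import boto3  # imported lazily; only needed for the actual request

    bucket, key = split_s3_url(url)
    response = boto3.client("s3").get_object(
        Bucket=bucket, Key=key, RequestPayer="requester"
    )
    return response["Body"].read()
```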
Jonathan Leon