
Using AWS CloudFormation I set up a CloudFront distribution to serve content from a private S3 bucket. I do not have the bucket configured as an S3 website — rather, I'm using the latest-and-greatest technique: Origin Access Control (OAC). See Restricting access to an Amazon S3 origin. I'm using Route53 and Certificate Manager to serve the CloudFront distribution over TLS with a custom domain example.com.

So far the basics are working fine for URLs that reference objects that exist in the S3 bucket. I can access https://example.com/foobar.html just fine, for example. But if I request a file that does not exist, such as https://example.com/missing.html, CloudFront returns a 403 "Access Denied" instead of a 404 "Not Found".

I can make a wild guess that some communication between CloudFront and S3 makes CloudFront think its access is denied if the object doesn't exist. (Still, that doesn't explain why.) Is this a bug? Is this expected behavior? How are we expected to use CloudFront+S3+OAC with this odd behavior: does AWS expect us to set up a CloudFront custom error response to convert 403 to 404? (But why would we want to assume that all access-denied errors in CloudFront really indicate a missing object on S3?)

Note that I found various other CloudFront questions related to 403, but none related to an OAC configuration, and most of the other questions were regarding a CloudFront distribution that always returned 403, not just for missing files.

Garret Wilson

2 Answers


Unless you have the s3:ListBucket permission, S3 returns the 403 Forbidden status and the AccessDenied error for missing objects, by design. This is because without s3:ListBucket, the principal doesn't have permission to know whether the object is missing or if it exists but they aren't allowed access.
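The rule above can be sketched as a small decision function. This is only a toy model of the behavior described here, not AWS code; the function name `s3GetObjectStatus` and its field names are made up for illustration.

```javascript
// Toy model of the HTTP status S3 returns for a GET on a single key.
// Hypothetical names; a simplification of the rule described above.
function s3GetObjectStatus(req) {
  if (req.objectExists) {
    // Existing object: readable only with s3:GetObject.
    return req.canGetObject ? 200 : 403;
  }
  // Missing object: only a caller allowed to list the bucket
  // is told that the key does not exist.
  return req.canListBucket ? 404 : 403;
}

// OAC policy with s3:GetObject only: a missing key looks like "Access Denied".
console.log(s3GetObjectStatus({ objectExists: false, canGetObject: true, canListBucket: false }));
// Same request after adding s3:ListBucket: a missing key is reported as such.
console.log(s3GetObjectStatus({ objectExists: false, canGetObject: true, canListBucket: true }));
```

The first call models the situation in the question, the second the situation after the bucket policy is updated.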

Note that unlike s3:GetObject, an object-level permission where the resource ARN is arn:aws:s3:::bucket-name/*, s3:ListBucket is a bucket-level permission, so the resource is arn:aws:s3:::bucket-name without the trailing /*.
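To make the difference between the two resource ARNs concrete, here is a minimal sketch that assembles such a bucket policy as a plain object. The helper name `buildOacBucketPolicy` and the sample bucket name, account ID, and distribution ID are placeholders, not values from the question.

```javascript
// Hypothetical helper: builds a bucket policy for a CloudFront OAC origin.
// All argument values used below are placeholders.
function buildOacBucketPolicy(bucketName, accountId, distributionId) {
  var bucketArn = "arn:aws:s3:::" + bucketName;
  var principal = { Service: "cloudfront.amazonaws.com" };
  // Restrict both statements to requests coming from one distribution.
  var condition = {
    StringEquals: {
      "AWS:SourceArn": "arn:aws:cloudfront::" + accountId + ":distribution/" + distributionId
    }
  };
  return {
    Version: "2012-10-17",
    Statement: [
      // Bucket-level permission: the resource is the bucket itself, no "/*".
      { Effect: "Allow", Principal: principal, Action: "s3:ListBucket", Resource: bucketArn, Condition: condition },
      // Object-level permission: the resource is every key in the bucket.
      { Effect: "Allow", Principal: principal, Action: "s3:GetObject", Resource: bucketArn + "/*", Condition: condition }
    ]
  };
}

var policy = buildOacBucketPolicy("bucket-name", "111122223333", "EDFDVBD6EXAMPLE");
console.log(JSON.stringify(policy, null, 2));
```

The printed JSON can be compared against an existing bucket policy to check which statement uses which ARN form.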

After updating the bucket policy, you should find that the 404s work as expected. You also need to set the CloudFront Default Root Object for the distribution to whatever you want returned when / is requested; otherwise a bucket listing will be returned, which is probably not what you want.

Also be aware of the Error Caching Minimum TTL, which causes CloudFront to cache those 403s for a while (5 minutes by default historically; the current documentation gives 10 seconds), separately from the other TTL settings for the cache behavior.

Michael - sqlbot
  • Ah, something like [this other answer](https://stackoverflow.com/a/38775442). That did the trick! Thank you for the spot-on answer as well as the additional useful details. – Garret Wilson Apr 27 '23 at 22:00
  • "… you also need to set the Cloudfront Default Root Object for the distribution …, otherwise a bucket listing will be returned, which is probably not what you want." Ah, now I understand why the `DefaultCacheBehavior` `DefaultRootObject` only applies to `/`, because its primary purpose must be to prevent listing the bucket contents—and the bucket namespace is flat, not hierarchical, so without this setting it would result in a listing of _all_ bucket names. – Garret Wilson Apr 27 '23 at 22:03
  • @GarretWilson it wouldn't list all the buckets, it would just list the contents of the one the distribution is pointing to. The official purpose of `DefaultRootObject` is to allow you to treat `/` as `/index.html` or something similar and replace the bucket listing with something else. If I remember right, it's been available since before the S3 web site hosting feature was rolled out, with more proper index document support, a few years ago. – Michael - sqlbot Apr 28 '23 at 19:45
  • It's long been my belief that the name of the `s3:ListBucket` privilege is essentially a design mistake that it's way too late to fix. The more correct and intuitive name of this privilege would have been `s3:ListObjects`, because that's what it allows. Similarly, `s3:ListAllMyBuckets` would have been more correctly called `s3:ListBuckets`, since that's the one that allows fetching a list of buckets in your account... but it wasn't. – Michael - sqlbot Apr 28 '23 at 19:46
  • "… it wouldn't list all the buckets …" Yes I understand. By "all bucket names" I meant "the names of all objects in the bucket" (not the best choice of words on my part). Because the namespace is flat, this will list _all_ objects in the bucket; furthermore a default root object will help `/`, but not `/foo/` or `/foo/bar/`. But a default root object will hide the directory listing of all the bucket objects. Anyway see my answer for a more comprehensive solution using functions, based upon the information you alerted me to. Thanks again. – Garret Wilson Apr 28 '23 at 19:50

Michael's answer is exactly correct. I'm leaving a separate answer here just to flesh out the details of exactly what needs to be done. I'll use CloudFormation for illustration.

Here's an example of updating the S3 bucket policy to include s3:ListBucket alongside s3:GetObject:

  BucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket: !Ref Bucket
      PolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service: cloudfront.amazonaws.com
            Action: s3:ListBucket
            Resource: !Sub "${Bucket.Arn}"
            Condition:
              StringEquals:
                AWS:SourceArn: !Sub "arn:aws:cloudfront::${AWS::AccountId}:distribution/${CloudFrontDistribution}"
          - Effect: Allow
            Principal:
              Service: cloudfront.amazonaws.com
            Action: s3:GetObject
            Resource: !Sub "${Bucket.Arn}/*"
            Condition:
              StringEquals:
                AWS:SourceArn: !Sub "arn:aws:cloudfront::${AWS::AccountId}:distribution/${CloudFrontDistribution}"

As Michael mentioned, you need to set DefaultRootObject to something, or CloudFront will return a bucket listing when / is requested. But DefaultRootObject applies only to /, so you still need a way to serve a default page for directory-style sub-paths such as /foo/bar/. And what if you don't want a default page at all? You can address both issues using CloudFront Functions.

First set up a parameter for the default page:

Parameters:
  DefaultPage:
    Description: The default page, such as `index.html`.
    Type: String
    Default: index.html

Then set up the function:

  DefaultPageFunction:
    Type: AWS::CloudFront::Function
    Properties:
      Name: default-page
      AutoPublish: true
      FunctionConfig:
        Comment: Convert root requests to default page.
        Runtime: cloudfront-js-1.0
      FunctionCode: !Sub |
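        function handler(event) {
          var request = event.request;
          var uri = request.uri;
          // Only directory-style requests (ending in "/") need rewriting.
          if (uri.endsWith("/")) {
            // Filled in by CloudFormation from the DefaultPage parameter.
            var defaultPage = "${DefaultPage}";
            if (defaultPage) {
              request.uri += defaultPage;
            } else {
              // No default page configured: refuse directory-style requests.
              return {
                statusCode: 403,
                statusDescription: "Forbidden"
              };
            }
          }
          return request;
        }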

This function handles both cases based upon the DefaultPage parameter. If it's set to a non-empty string, it will add that to any …/ path. If DefaultPage is empty, then it will return an HTTP 403 Forbidden for every …/ path.
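To see both cases without deploying anything, the handler logic can be exercised locally with Node.js. The sketch below is a stand-alone copy in which the default page is passed as an argument instead of being substituted by CloudFormation; the `makeHandler` wrapper is hypothetical, and the event objects contain only the `request.uri` field that the function reads.

```javascript
// Stand-alone copy of the handler logic for local experimentation.
// makeHandler is a hypothetical wrapper that takes the place of the
// CloudFormation parameter substitution.
function makeHandler(defaultPage) {
  return function handler(event) {
    var request = event.request;
    if (request.uri.endsWith("/")) {
      if (defaultPage) {
        // Append the default page to directory-style paths.
        request.uri += defaultPage;
      } else {
        // No default page: answer directly with 403 Forbidden.
        return { statusCode: 403, statusDescription: "Forbidden" };
      }
    }
    return request;
  };
}

var withDefault = makeHandler("index.html");
console.log(withDefault({ request: { uri: "/foo/bar/" } }).uri);    // "/foo/bar/index.html"
console.log(withDefault({ request: { uri: "/foobar.html" } }).uri); // "/foobar.html"

var withoutDefault = makeHandler("");
console.log(withoutDefault({ request: { uri: "/foo/" } }).statusCode); // 403
```

Paths that do not end in a slash pass through unchanged in both configurations.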

Finally add it to your AWS::CloudFront::Distribution DefaultCacheBehavior:

          …
          FunctionAssociations:
            - EventType: viewer-request
              FunctionARN: !GetAtt DefaultPageFunction.FunctionMetadata.FunctionARN
        DefaultRootObject: !Ref DefaultPage

I threw in the DefaultRootObject for good measure, even though the function should take care of that case as well.

Warning! There seems to be a bug with CloudFormation that will get the parameter "stuck" with one value, so double-check that CloudFormation deployed the correct value if you modify DefaultPage after deploying. See the ticket I just opened: CloudFormation deploy doesn't update !Sub contents with new parameter value. #989

Garret Wilson