I have an S3 bucket with a CloudFront CDN in front of it.

This S3 bucket is "immutable", which means that once I upload a file there, I never delete or update it. It is therefore safe for all clients to cache the files coming from S3/CloudFront very aggressively.

Currently, ETags are working great, and clients get 304 responses most of the time. But getting a 304 response still involves a round trip that could be avoided by more aggressive caching.

So I'd like this behavior:

  • CloudFront CDN cache should never get invalidated, because the S3 content never changes. CloudFront should not need to ask S3 for a file more than once. I think I've successfully configured that using the CloudFront distribution settings.

  • CloudFront should serve all files with header Cache-Control: max-age=365000000, immutable (immutable is a new, partially supported value as of 2016)

I don't understand how I can achieve the desired result. Should I handle that at the CloudFront or S3 level? I've read some material about configuring the appropriate header for each S3 file. Isn't there a global setting to serve all files with a custom HTTP header that I could use?

Sebastien Lorber

1 Answer

Should I handle that at CloudFront or S3 level?

At the time of writing (2016), there is no global setting for adding custom HTTP headers in either CloudFront or S3. To add HTTP headers to objects, they must be set in S3, individually on each object in the bucket. They are stored in the object's metadata and can be found in the Metadata section for each object in the AWS S3 Console.

Typically, it's easiest to set the headers when adding the object to the bucket - the exact mechanism for doing so depends on which client app or SDK you're using.

e.g. with the AWS CLI you use the --cache-control option:

aws s3 cp test.txt s3://mybucket/test2.txt \
    --cache-control max-age=365000000,immutable

To modify existing objects, the s3cmd utility has a modify option as described in this SO answer: https://stackoverflow.com/a/22522942/6720449

Or you can use the aws s3 command to copy objects back onto themselves while replacing the metadata, as explained in this SO answer: https://stackoverflow.com/a/29280730/6720449. e.g. to replace metadata on all objects in a bucket:

aws s3 cp s3://mybucket/ s3://mybucket/ --recursive --metadata-directive REPLACE \
    --cache-control max-age=365000000,immutable
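After setting the metadata, it can be worth confirming that the header really reaches clients. The following is a minimal sketch, not part of the original answer: has_immutable_header is a hypothetical helper name, and the CloudFront domain in the usage comment is a placeholder. It reads raw response headers on stdin and succeeds only if a Cache-Control header containing immutable is present.

```shell
#!/usr/bin/env bash
# Hypothetical helper (name is an assumption, not an AWS tool).
# Reads raw HTTP response headers on stdin; exits 0 only if a
# Cache-Control header is present and contains "immutable".
has_immutable_header() {
    # Strip the carriage returns that end HTTP header lines, lowercase
    # everything (header names are case-insensitive), then let awk scan
    # the whole input and report through its exit status.
    tr -d '\r' | tr 'A-Z' 'a-z' |
        awk '/^cache-control:/ && /immutable/ { found = 1 } END { exit !found }'
}

# Usage sketch against a placeholder CloudFront domain:
#   curl -sI https://dxxxxxxxxxxxx.cloudfront.net/test2.txt | has_immutable_header \
#       && echo "immutable header present"
```

Objects already cached at an edge may keep being served with their old headers until they expire or are evicted, so a freshly uploaded key is the most reliable thing to check.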

CloudFront CDN cache should never get invalidated

This is quite a stringent requirement - you can't prevent a CloudFront cache from ever being invalidated. There is no setting that stops an invalidation from being created if the user creating it has sufficient permissions. In a roundabout way, you can prevent invalidations by ensuring that no users, roles, or groups hold the cloudfront:CreateInvalidation IAM permission for the distribution, though this may not be practical.

Separately, there are a few reasons CloudFront might expire or re-fetch a cached object sooner than the backend's Cache-Control suggests - e.g. if the Maximum TTL setting is set and it is less than the max-age.
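To make the Maximum TTL interaction concrete, here is a rough sketch of how the edge cache lifetime can be reasoned about when the origin sends Cache-Control: max-age. It assumes CloudFront clamps max-age between the cache behavior's Minimum TTL and Maximum TTL; effective_edge_ttl is a hypothetical helper, and the TTL values in the example call are illustrative rather than taken from the question's distribution.

```shell
#!/usr/bin/env bash
# Hypothetical helper: estimate how long an edge keeps an object, assuming
# the origin's max-age is clamped to the range [Minimum TTL, Maximum TTL].
# All arguments are in seconds.
effective_edge_ttl() {
    local max_age=$1 min_ttl=$2 max_ttl=$3
    local ttl=$max_age
    # Raise to Minimum TTL if the origin asked for less.
    if [ "$ttl" -lt "$min_ttl" ]; then ttl=$min_ttl; fi
    # Cap at Maximum TTL if the origin asked for more.
    if [ "$ttl" -gt "$max_ttl" ]; then ttl=$max_ttl; fi
    echo "$ttl"
}

# max-age from the question, with an assumed Minimum TTL of 0 and an
# assumed Maximum TTL of one year (31536000 seconds):
effective_edge_ttl 365000000 0 31536000   # prints 31536000
```

This only estimates when the edge revalidates with S3. The Cache-Control header stored on the object is still what browsers receive, so client-side caching is unaffected by the cap.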

Chris Simon
  • Thanks. I'll go with the modify command. If CloudFront sometimes invalidates, that's not good, but it's not a big deal either. – Sebastien Lorber Nov 08 '16 at 11:54
  • You are quite right that *"CloudFront CDN cache should never get invalidated"* is very stringent. In fact, it is simply not a reasonable expectation. There isn't one monolithic cache -- there is one at each edge, and each edge handling a request for an object fetches it initially from the origin. Objects can also be evicted from any given edge for lack of frequent access ("popularity"). It's a cache... volatile by definition but overall very consistent. See also [Why is Cloudfront evicting objects from cache within mere hours?](http://stackoverflow.com/a/32878535/1695906) – Michael - sqlbot Nov 08 '16 at 16:30
  • If I understood correctly, "immutable" has nothing to do with PURGING an asset or creating an invalidation as stated here. It just means that the asset will never be STALE regardless of what I said on the Cache-Control header. So for CloudFront, it just means that there is no point in sending a REVALIDATION request to the origin, per definition, it could just go on and serve stale content from that edge location and avoid all those revalidation requests hitting the origin. – Igor Escobar Jul 15 '20 at 15:09