
My frontend runs on nginx, and I'm serving a bunch of .chunk.js files produced by my React build.

Every time I update my frontend, I rebuild the Docker image and update the Kubernetes deployment. However, some users might still be trying to fetch the old JS files.

I would like Google Cloud CDN to serve the stale, cached version of the old files; however, it seems that it will only serve stale content in the event of errors or the server being unreachable, not on a 404.

Cloud CDN also has something called "negative caching", but that seems to control how long a 404 response is cached, not whether stale content can be served instead.

--> What's the best way to temporarily serve old files on Google Cloud? Can this be done with Cloud CDN?

(Ideally without some funky build process that requires deploying the old files as well)

DemiPixel
  • Why would you want to temporarily serve old files? Maybe rethink your deployment? – Martin Zeitler May 04 '21 at 01:25
  • CDNs do not cache objects forever. Your strategy will not work once the object is flushed from the cache ... – John Hanley May 04 '21 at 01:38
  • @MartinZeitler If a user loads the old index.html, it will try to load the old js files (this is kind of necessary, and create-react-app purposefully includes a file's hash in the name to prevent mixed versions of js files). Especially with lazy loading, the user might not load a script for at least a couple of seconds, if not a couple of minutes. – DemiPixel May 05 '21 at 02:07
  • @JohnHanley My goal is not to cache them forever, just for a reasonable period of time (Google can cache up to a week, and I don't think even one day would be needed). I could likely achieve the same effect by returning 500 instead of 404 (Google will think there's a server error and return the stale file it has cached there for up to a week), but replacing a status of 404 with 500 seems like... very bad practice; see the sketch after these comments. – DemiPixel May 05 '21 at 02:10
  • Hey @DemiPixel did you find a solution? I too have this problem, where new deployments cause 404 responses for requests to old hashed JS files, which is a problem for about an hour after deployments. – Kevin Danikowski Oct 06 '21 at 14:56
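
A rough nginx sketch of that 404-to-500 idea (the location pattern and fallback name here are illustrative, not from anyone's actual config):

location ~* \.chunk\.js$ {
  # Serve the chunk if this image still has it; otherwise fall through.
  try_files $uri @missing_chunk;
}

location @missing_chunk {
  # Answer 500 instead of 404 so Cloud CDN treats the miss as an origin
  # error and can keep serving its stale cached copy.
  return 500;
}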

4 Answers

1

I have this issue as well; if you find a way to set it up on Google CDN, please let me know.

Here is my workaround:

  1. Choose "Use origin settings based on Cache-Control headers" (the USE_ORIGIN_HEADERS cache mode).

  2. Since most browsers cache static JS assets, I set a reasonably short Cache-Control time on .html files, like 5-60 minutes, whereas the JavaScript files get a much longer cache time, like a week.

Some Context: After deployment, if Google serves the old index.html from its CDN cache, the user's browser will request the old JS files. If it's time for those JS files to be revalidated, Google will see they are now 404s and send a 404 response instead of the JS file. The workaround above makes sure the JS files are highly likely to still be available in the cache, while the index.html is updated more frequently.
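
As a rough illustration of step 2, the nginx side of that header split could look something like this; the max-age values and locations are placeholders, not Kevin's actual config:

location = /index.html {
  # Short TTL: new deployments are picked up within minutes.
  add_header Cache-Control "public, max-age=300";
}

location ~* \.(js|css)$ {
  # Long TTL: old hashed chunks stay cached across deployments.
  add_header Cache-Control "public, max-age=604800";
}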

Update: This works... but there appears to be a caveat: if the page isn't a frequently trafficked one, Google will eventually return a 404 on the JavaScript file before the specified time. Even though Google's docs state it won't get revalidated for 30 days, this appears to be false.

Update 2: Google's Response:

The expiration doc says "Cloud CDN revalidates cached objects that are older than 30 days." It doesn't say that Google won't revalidate prior to 30 days. Things fall out of cache arbitrarily quickly and max-age is just an upper bound.

Kevin Danikowski
  • Does this actually work? I've still been having issues. To my understanding, Google actually occasionally validates that the page hasn't been changed; after a couple of fetches of a JS file, it will see that it changed (to a 404) and update the cache. – DemiPixel Oct 06 '21 at 17:05
  • @DemiPixel I updated the answer accordingly after some testing. – Kevin Danikowski Oct 07 '21 at 18:18
  • I'll mark this as correct for now. It sadly doesn't solve issues with lazy loading, but it doesn't seem like Google has anything built in to handle this "correctly". Seems like you would need to create some kind of custom solution (e.g. host the files from the previous couple of versions or something like that). – DemiPixel Oct 07 '21 at 18:25
  • @DemiPixel I'm unsatisfied with this as well; it helps for my highly trafficked pages but doesn't help for the rest. I'll ask Google support at some point in the next month or two (I haven't signed up for their support service yet) and see if I can get some better answers. I'll come back and update this after. – Kevin Danikowski Oct 07 '21 at 18:44
  • I've switched over to serving 500 instead of 404. Seems like it randomly gives a 500 and then gives a 200 with the correct, cached response after? But if I hit it with curl, it always returns the 500. Very strange behavior. – DemiPixel Oct 29 '21 at 11:23
0

You can set a `Cache-Control: stale-while-revalidate=<seconds>` response header in your application (note that `max-stale` is a request directive, not something a server can send in a response). This allows stale content to be served for a set number of seconds, which is handy when this is happening right after new deployments. But still set your HTML files to either no caching or a short TTL, as Kevin said.
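
For example, with nginx as the origin, the header might look like this; the directive values here are illustrative:

add_header Cache-Control "public, max-age=3600, stale-while-revalidate=86400, stale-if-error=86400";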


Inshaal93
0

It's unclear if the original problem is possible to solve, but given that this is a popular question, I figured I'd share my workaround for static assets (JS, CSS, images, etc. that don't have any special permissions):

I have my CI/CD use a bucket (or "Space" in Digital Ocean) and upload all the JS/CSS/etc files there:

s3cmd sync -P --no-mime-magic --guess-mime-type ./build/assets/ s3://my-bucket-name/frontend/assets/

I then have a script to delete files older than 30 days:

s3cmd ls s3://my-bucket-name/frontend/assets/ | while read -r line; do
  # Each output line of `s3cmd ls` looks like: "<date> <time> <size> <path>"
  createDate=$(echo "$line" | awk '{print $1" "$2}')
  createDate=$(date -d "$createDate" +%s)   # requires GNU date
  olderThan=$(date -d "-30 days" +%s)
  if [[ $createDate -lt $olderThan ]]; then
    fileName=$(echo "$line" | awk '{print $4}')
    if [[ -n $fileName ]]; then
      echo "Deleting stale $fileName"
      s3cmd del "$fileName"
    fi
  fi
done

Finally, you'll want to set up cdn.your-site.com to point to the bucket/CDN, and update your build process so that assets are accessed via cdn.your-site.com (in Vite, I use a custom renderBuiltUrl that checks for NODE_ENV === 'production' and type === 'asset'; a sketch of that follows).
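
A minimal sketch of what that Vite config might look like, assuming the bucket layout from the sync command above (the hostname and path prefix are placeholders):

// vite.config.ts
import { defineConfig } from 'vite';

export default defineConfig({
  experimental: {
    renderBuiltUrl(filename, { type }) {
      // Point hashed build assets at the CDN in production builds only.
      if (process.env.NODE_ENV === 'production' && type === 'asset') {
        return `https://cdn.your-site.com/frontend/${filename}`;
      }
      // Everything else keeps Vite's default relative URLs.
      return { relative: true };
    },
  },
});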

This whole process means that old assets will stay up for 30 days, allowing old sessions to access them. The bucket pricing is extremely minimal and the CDN costs are identical.

DemiPixel
-1

The serveWhileStale setting combines both the stale-while-revalidate and the stale-if-error HTTP caching features.

The default, minimum, and maximum values are as follows:

Default: 86,400 seconds (one day)
Minimum: 0 seconds (disables the feature)
Maximum: 604,800 seconds (one week)

Stale content is served up to the specified limit past the cache entry expiration time, which is defined by the max-age, s-maxage, or Expires headers.

See more at: https://cloud.google.com/cdn/docs/serving-stale-content
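
For reference, serve-while-stale can also be set from the CLI; a sketch, assuming a global backend service named my-backend-service:

gcloud compute backend-services update my-backend-service \
  --global \
  --serve-while-stale=86400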

  • Indeed, I have `serveWhileStale: 86400` set. However, 404 is not considered a server error, and when it revalidates it will just realize there's a 404 and start serving the 404! Is there a way to use `serveWhileStale` for 404s? – DemiPixel May 09 '21 at 23:30
  • Have you checked out the cache modes: https://cloud.google.com/cdn/docs/caching#cache-modes ? Maybe FORCE_CACHE_ALL – Nexus Software Systems May 10 '21 at 10:22
  • I believe I've used that setting before; that mainly controls *what* gets cached. However, as soon as it gets revalidated for any reason, it will see it's a 404 and cache the 404. – DemiPixel May 11 '21 at 00:02