
I was reading this answer about how you can append the hash of a resource, such as a JS or CSS file, to its filename, so that the browser only downloads it when it has changed and otherwise uses a cached version.

Why would you want to use this over the HTTP header Cache-Control: no-cache?

Using a hash in the filename means you need to make sure the HTML containing the <script src="myscript.someHash.js"></script> tags is never cached.

Why not allow the HTML (and all other resources) to be cacheable using no-cache?

David Klempfner
  • @KevinChristopherHenry yeah, pattern 1. Doesn't it involve making sure the HTML that contains the links to the JS and CSS files is never cached? Is it about pros and cons: the pro is that pattern 1 doesn't involve a definite round trip to the server, but the con is that it involves getting the same HTML from the server every time? – David Klempfner Sep 02 '21 at 23:35
  • @KevinChristopherHenry Pattern 1 only works if the HTML is fetched from the server every time. You can't use a cached copy of the HTML, otherwise it might contain old hashes of the JS filenames. Whereas with pattern 2, instead of downloading the entire HTML, you just get a 304 Not Modified most of the time. – David Klempfner Sep 03 '21 at 01:33
  • No, you're misunderstanding something. You do the exact same thing with the HTML file in Pattern 1 that you do with all files in Pattern 2, send a conditional request. If the server responds with a `304` then you know it hasn't changed and, by definition, that the filenames it references are not old. – Kevin Christopher Henry Sep 03 '21 at 02:37
  • @KevinChristopherHenry that makes sense. Thanks for the clarification. Btw do you know why you would use `Cache-Control: max-age=31536000` over `Cache-Control: max-age=31536000, must-revalidate`? Without `must-revalidate`, when it expires, it could unnecessarily redownload the same file, whereas with `must-revalidate`, it'll first check if the file is old or not. – David Klempfner Sep 03 '21 at 03:09
  • Happy to help, I'll put that into an answer. As for `must-revalidate`, it's only relevant when the server is otherwise unreachable. As the specification puts it: "The `must-revalidate` directive ought to be used by servers if and only if failure to validate a request on the representation could result in incorrect operation." So, would you rather your users see an outdated page, or a `504` error page? – Kevin Christopher Henry Sep 03 '21 at 03:31
  • @KevinChristopherHenry it seems that `Cache-Control: max-age=31536000, must-revalidate` is superior to `Cache-Control: max-age=31536000`. `must-revalidate` not only doesn't require a full download of the file even though the stale cached version might still be ok, but it also prevents a 504? Is there a reason why you'd ever want to omit `must-revalidate`? – David Klempfner Sep 05 '21 at 22:38
  • Revalidation will be attempted in any case; all `must-revalidate` affects is what happens in case of an error (say, the server is unreachable). If `must-revalidate` is set, the client will show an error rather than serve the stale page. That's normally not what you want, which is why the standard suggests limiting its use to those cases where "failure to validate a request on the representation could result in incorrect operation". – Kevin Christopher Henry Sep 05 '21 at 23:18
  • @KevinChristopherHenry I completely misunderstood what `must-revalidate` was for. Thanks for the explanation. – David Klempfner Sep 05 '21 at 23:54
  • @KevinChristopherHenry I just realised that "must-revalidate doesn't mean 'must revalidate', it means the local resource can be used if it's younger than the provided max-age, otherwise it must revalidate" from https://jakearchibald.com/2016/caching-best-practices/ isn't entirely true. It doesn't mention anything about what you've mentioned here with 504. It also makes it sound almost the same as no-cache by itself (the only difference being no-cache doesn't HAVE to revalidate, it just can't use a cached version without revalidation, so it could still request the resource from the server). – David Klempfner Sep 10 '21 at 00:53
  • I'm not really sure what you're saying here. The `504` bit is implicit in "must"; if the standard says `MUST`, and you can't, then that's an error. As for "almost the same", `no-cache` and `no-cache, must-revalidate` will behave the same way 99% of the time (or whatever the uptime is of your server). In both cases, revalidation will be attempted. – Kevin Christopher Henry Sep 10 '21 at 01:53
  • @KevinChristopherHenry got it now, thanks for the clarification. – David Klempfner Sep 10 '21 at 03:28

1 Answer

What's being discussed here are two different ways to cache pages while still ensuring correctness. This article does a good job of describing them, so I will use its terminology: Pattern 1 uses no-cache on the HTML page, but creates unique filenames for each version of the static assets, and allows them to be cached forever. Pattern 2 just uses no-cache for everything.

They both work. Pattern 1 will be strictly superior from a performance perspective, since only the HTML pages are served with no-cache; the uniquely named static assets can be used straight from the cache without contacting the server. The only downside is that the build workflow is more complicated, since you need to generate unique filenames and make sure your HTML page links to them properly. Fortunately, many frameworks support this kind of workflow.
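
To make the build step concrete, here is a minimal sketch of how the unique filenames in Pattern 1 could be generated. It is not taken from any particular framework: the helper names `hashed_name` and `cache_control_for`, the choice of SHA-256, and the 8-character digest length are all assumptions for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_name(filename: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a content hash before the extension (hypothetical helper).

    'myscript.js' becomes 'myscript.<hash>.js'. The name changes only when
    the content changes, so an old cached copy is never reused by mistake.
    """
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))


def cache_control_for(filename: str) -> str:
    """Pick a Cache-Control value per Pattern 1 (hypothetical helper)."""
    if filename.endswith(".html"):
        # The HTML may be stored, but must be revalidated on every use.
        return "no-cache"
    # Hashed assets never change under the same name, so cache them for a year.
    return "max-age=31536000"
```

The build would then write each asset under its hashed name and rewrite the `<script>` and `<link>` tags in the HTML to point at those names.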

It seems like your main concern is that "you need to make sure the HTML containing the [static links] is never cached." It can still be cached, though; it just needs to be revalidated every time it's used. That's true of every file in Pattern 2, so it isn't a comparative disadvantage.
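
Revalidation here means a conditional request. As a rough sketch of the server side, assuming the validator is an `ETag` derived from the body (the function names `make_etag` and `respond` are made up for this example, and a real server would also handle `Last-Modified`, weak validators, and multiple tags in `If-None-Match`):

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a quoted ETag from the response body (assumed scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, body) for a no-cache resource."""
    etag = make_etag(body)
    headers = {"Cache-Control": "no-cache", "ETag": etag}
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still current: no body is sent.
        return 304, headers, b""
    return 200, headers, body


# First visit: full download. Later visits: a small 304 unless the HTML changed.
status, headers, _ = respond({}, b"<html>v1</html>")
revisit_status, _, revisit_body = respond(
    {"If-None-Match": headers["ETag"]}, b"<html>v1</html>"
)
```

When the HTML does change, for example because it now references a new hashed filename, the ETag no longer matches and the client receives the new page with a `200`.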

Kevin Christopher Henry