Do browsers not follow the HTTP spec's Cache-Control correctly?

Question

I am somewhat new to web development and have noticed an issue: browsers do not seem to respect the Cache-Control header. I have it set to no-cache, no-store, must-revalidate, yet many of my clients have a cache to begin with (which no-store should prevent, according to https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Cache-Control#no-store ), and the cached copy is used rather than being revalidated with the server. This leads to broken pages when I change a JS script referenced in a page. Only after I tell clients to refresh without the cache does the browser fetch the new file.

To be compliant with the HTTP protocol and spec, don't browsers need to respect the no-store policy? Or are none of the major browsers properly compliant, and if so, why haven't they been fixed so that we don't need workarounds like appending query strings to file names or using the file's hash or last modification date?

Yeah, my previous headers didn't have any Cache-Control flags, but how long is the default max-age before the browser fetches a new copy from the server? It is also weird that query strings seem to be the preferred solution everywhere I've looked, if there is a better way using Cache-Control flags. Do you know of any reason they have become the seemingly preferred solution? I have wondered: do people not have the means to set the Cache-Control flags themselves, or is it that if you sent no Cache-Control flags in the past, it will take a long time for the browser to pick up the new ones? — JordanPlayz158, Mar 22 '22 at 11:15

Kevin Christopher Henry · Accepted Answer · 2022-03-24T15:56:07.060

You initially served the resource without cache headers. In that case, the specification allows the client to choose the cache time itself:

Since origin servers do not always provide explicit expiration times, a cache MAY assign a heuristic expiration time when an explicit time is not specified, employing algorithms that use other header field values (such as the Last-Modified time) to estimate a plausible expiration time.

Different browsers use different algorithms, but in any case the heuristic expiration time probably won't be very long. Your problem might have already resolved itself.
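To make the heuristic concrete, here is a minimal sketch of one way a cache could estimate a lifetime from Last-Modified. RFC 7234 mentions 10% of the time since last modification as a typical fraction; the function name, the default fraction, and the example header values below are assumptions for illustration, and real browsers may compute this differently.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime


def heuristic_freshness(date_header: str, last_modified_header: str,
                        fraction: float = 0.1) -> timedelta:
    """Estimate a freshness lifetime as a fraction of the time since Last-Modified.

    Hypothetical helper: the 10% default follows the "typical setting" that
    RFC 7234 mentions, not the behavior of any particular browser.
    """
    date = parsedate_to_datetime(date_header)
    last_modified = parsedate_to_datetime(last_modified_header)
    # Guard against a Last-Modified value that lies after the Date header.
    unchanged_for = max(date - last_modified, timedelta(0))
    return unchanged_for * fraction


# Example headers (made up): the file was last changed 10 days before it was served.
lifetime = heuristic_freshness(
    "Thu, 24 Mar 2022 12:00:00 GMT",
    "Mon, 14 Mar 2022 12:00:00 GMT",
)
print(lifetime)
```

Under this rule, a script that had been unchanged for ten days when it was served would be reused for about one day before the cache checks with the server again.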

As for query strings, I think your confusion comes from conflating at least three distinct issues. One is the HTTP protocol mechanism for communicating cache policies. That is covered in RFC 7234 and mainly involves the proper use of the Cache-Control response header.

A separate issue is what cache strategy to use. That is, which resources should be cached, and for how long? There are different ways to approach this; my suggestion would be to follow the best practices discussed here.

Finally, there's how to fix your mistake if you communicated the wrong cache policy and now need an already-cached resource to be ignored or invalidated. In that case, if possible, you could just use a different resource (i.e. change the name). Adding query strings is sometimes suggested here, but it's not a great solution since the standard does not forbid clients from caching resources with query strings.

Getting back to your question, you can temporarily fix your mistake (missing Cache-Control headers) by changing the name of the linked resource, or just by waiting a short time for the heuristic expiration time to pass. Longer term, you should decide how you want your different resources to be cached, and then use Cache-Control to communicate that intent to the browser.
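As a rough sketch of what "changing the name" plus an explicit policy can look like, the snippet below derives a content-based file name and picks a Cache-Control value per resource type. This is one common strategy, not necessarily the one in the linked best practices; the function names, the 8-character digest, and the specific header values are all assumptions for illustration.

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Return the path with a short content hash inserted before the extension.

    Hypothetical helper: new content gives a new name, so a previously cached
    copy under the old name is simply never requested again.
    """
    p = PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value (assumed policy, adjust to your own needs)."""
    if path.endswith(".html"):
        # Pages may be stored but must be revalidated before each reuse.
        return "no-cache"
    # Fingerprinted assets never change under the same name, so cache them long.
    return "public, max-age=31536000, immutable"


old = fingerprinted_name("js/app.js", b"console.log('v1');")
new = fingerprinted_name("js/app.js", b"console.log('v2');")
print(old, new, cache_control_for("index.html"), cache_control_for(new))
```

The page itself is always revalidated, so it picks up the new script name as soon as it is deployed, while the script can be cached for a long time because its name changes whenever its content does.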

https://stackoverflow.com/questions/118884/how-to-force-the-browser-to-reload-cached-css-and-javascript-files has no solutions that use Cache-Control headers. This person was using Cache-Control headers, and people told him to use query strings, or the other method of hashed names for the JS files, or random numbers ( https://stackoverflow.com/questions/49282489/client-side-cache-on-css-js ). This one doesn't recommend query strings but recommends what is basically an equivalent ( https://stackoverflow.com/questions/37204296/cache-invalidation-using-the-query-string-bad-practice ), and the other 2 show a Google article 1/2 — JordanPlayz158, Mar 24 '22 at 11:41
which shows you how to use query strings for cache expiration, and the next recommends the time method. So there are a lot of posts, and even Google, seemingly recommending the wrong practice and even steering people away from the best practice. These posts (and those were only a few) can lead people like me astray: I was trying to find the proper way to do it by looking up "stackoverflow don't store cache on client", which does bring up Cache-Control in the top post but doesn't mention the important caveat you described. 2/2 — JordanPlayz158, Mar 24 '22 at 11:49
@JordanPlayz158: I edited the question with some additional information. To be blunt, you can't expect to understand a complex technical topic by doing a google search and reading StackOverflow answers. If you want to understand how HTTP caching works, I suggest skimming through RFC 7234, which is reasonably short and clear, as well as being comprehensive and correct. — Kevin Christopher Henry, Mar 24 '22 at 16:00
Thank you, and yes, it is unrealistic to try to understand how HTTP caching works through StackOverflow answers or Google searches. — JordanPlayz158, Mar 26 '22 at 18:20
