
I have a backend GCS bucket behind a Google Cloud HTTP(S) load balancer with Cloud CDN enabled.

I'm trying to answer these questions based on response headers:

  • was this response served from CDN
  • if so which location/region
  • was this a cache hit/miss

Here are the response headers. Based on cache-control, this should in theory be cached. However, I don't see anything in them that verifies the CDN is working correctly. Similarly, all the other headers (x-goog-* and Server: UploadServer) seem to be coming from the GCS server, not the CDN.

accept-ranges: bytes
age: 551
alt-svc: clear
cache-control: public, max-age=3600
content-length: 298303
content-type: image/jpeg
date: Wed, 05 Aug 2020 23:07:33 GMT
etag: "f0b6c60f635c784dd7f34ab9c1527867"
expires: Thu, 06 Aug 2020 00:07:33 GMT
last-modified: Wed, 05 Aug 2020 23:07:16 GMT
server: UploadServer
status: 200
X-DNS-Prefetch-Control: off
x-goog-generation: 1596668836233926
x-goog-hash: crc32c=rD4sZw==
x-goog-hash: md5=8LbGD2NceE3X80q5wVJ4Zw==
x-goog-metageneration: 1
x-goog-storage-class: STANDARD
x-goog-stored-content-encoding: identity
x-goog-stored-content-length: 298303
x-guploader-uploadid: AAANsUktJ98kPCHjiR2oBi6N-[...]

For example, Cloudflare provides these response headers:

  • where was the request served: cf-ray: 5be4505beb76bca2-SEA
  • what was the cache status: cf-cache-status: REVALIDATED
  • was my request hitting CDN or my backend directly: server: cloudflare
ahmet alp balkan

2 Answers


There is now a {cdn_cache_status} variable that you can include in a custom response header: https://cloud.google.com/load-balancing/docs/custom-headers#variables

Using gcloud SDK v309.0.0 or greater:

➜  gcloud beta compute backend-services update my-backend --global \
--enable-cdn \
--custom-response-header='Cache-Status: {cdn_cache_status}' \
--custom-response-header='Cache-ID: {cdn_cache_id}'

Example output:

< HTTP/2 200
< content-type: text/plain; charset=utf-8
< date: Mon, 14 Sep 2020 21:40:05 GMT
< server: Google Frontend
< content-length: 1098
< via: 1.1 google
< cache-control: public, max-age=10
< age: 2
< x-frame-options: DENY
< cache-status: hit
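
To check these headers from a script rather than by eye, something like the following might work. This is only a minimal sketch: it assumes the header names `Cache-Status` and `Cache-ID` configured in the command above, and the function name `read_cdn_headers` is hypothetical.

```python
def read_cdn_headers(headers):
    """Return (status, cache_id) from a mapping of response headers.

    Header names are matched case-insensitively, because HTTP/2 sends them
    in lowercase. Either value is None when its header is absent.
    """
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    status = lowered.get("cache-status")
    cache_id = lowered.get("cache-id")
    return (status.lower() if status else None, cache_id)


# Headers copied from the example output above; it showed no cache-id header.
example = {"via": "1.1 google", "age": "2", "cache-status": "hit"}
print(read_cdn_headers(example))
```

The function takes a plain mapping, so it can be fed the headers from whichever HTTP client you already use.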
ahmet alp balkan
elithrar

At the moment, you cannot answer the above questions just by looking at the headers on the client side.

One indication of whether the request was served from the cache is the age header, which Cloud CDN appends to responses.
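
As a rough illustration of that heuristic, the check could look like this. It is a sketch only: the function names are hypothetical, and the presence of an age header is a hint, not proof of a cache hit.

```python
def age_seconds(headers):
    """Return the Age header as an int, or None if it is absent or malformed."""
    for name, value in headers.items():
        if name.lower() == "age":
            try:
                return int(value.strip())
            except ValueError:
                return None
    return None


def probably_served_from_cache(headers):
    """Heuristic only: an Age header suggests a cache served the response."""
    return age_seconds(headers) is not None


# Two of the headers from the question.
question_headers = {"age": "551", "server": "UploadServer"}
print(age_seconds(question_headers), probably_served_from_cache(question_headers))
```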

If you have enabled cache logging at the HTTP load balancer level, you can get all of the above information from the logs.

More specifically, from these fields:

  • httpRequest.cacheHit, which indicates whether the request was served from the cache.
  • jsonPayload.cacheId, which identifies the location and cache instance that served the cached response.
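
If you export those log entries as JSON, pulling the two fields out might look like the sketch below. The function name and the sample entry are hypothetical (the cacheId value is a made-up placeholder, not real log output), and treating a missing cacheHit as false is an assumption.

```python
import json


def cache_info_from_log_entry(entry):
    """Extract the cache hit flag and cache ID from one log entry dict."""
    http_request = entry.get("httpRequest", {})
    payload = entry.get("jsonPayload", {})
    return {
        # Assumption: a missing cacheHit field is treated as "not a hit".
        "cache_hit": bool(http_request.get("cacheHit", False)),
        "cache_id": payload.get("cacheId"),
    }


# Hypothetical entry for illustration only.
sample = json.loads(
    '{"httpRequest": {"cacheHit": true}, "jsonPayload": {"cacheId": "SEA-00000000"}}'
)
print(cache_info_from_log_entry(sample))
```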

More detailed information on the above can be found here 1.

  • So is it fair to say that if the `age` header is not present, it was a miss? I've been testing, and I notice that once I've requested a resource, the `age` header remains, although it keeps growing. Then, if I leave it for a bit, and reload, occasionally the `age` header is not present. I assume that means that the GCP CDN refreshed the resource from the origin? – antun Aug 31 '20 at 23:29
  • That's correct. Content that is not accessed very often tends to get evicted from the cache regardless of its expiration time. That will be a cache miss, and the content needs to be fetched from the origin. https://cloud.google.com/cdn/docs/overview#eviction – Kostikas Visnia Sep 02 '20 at 11:12
  • It turns out there's now a custom header configurable on the response to indicate the cache status. Check out the approved answer. – ahmet alp balkan Sep 21 '20 at 16:22