13

I'm serving a set of resources through content negotiation. Concretely, any URL can be represented in different formats, depending on the client's Accept header.

An example of this can be seen at Facebook:

  • curl -H "Accept: application/json" http://graph.facebook.com/daft-punk
    results in JSON
  • curl -H "Accept: text/turtle" http://graph.facebook.com/daft-punk
    results in Turtle

I'm looking for a CDN that caches content based on URL and the client's Accept header.

Example of what goes wrong

CloudFlare doesn't support this: if one client asks for HTML, then all subsequent requests to that URL receive the HTML representation, regardless of their preferences. Others have similar issues.

For example, if I placed CloudFlare in front of graph.facebook.com (and configured it to cache “extensionless” resources, which it does not do by default), it would behave incorrectly:

  1. I ask for http://graph.facebook.com/daft-punk in JSON through curl;
    in response, CloudFlare requests the JSON representation from the origin server, caches it, and serves it.
  2. I ask for http://graph.facebook.com/daft-punk through my browser (thus in HTML);
    in response, CloudFlare sends the cached JSON (!) representation, even though the origin server would have sent the HTML version.

What would be needed instead

The correct behavior would be for CloudFlare to ask the origin server again, since the second client sent a different Accept header. After that, requests with the same Accept header can be served from cache.
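The desired behavior can be sketched as a toy cache whose key combines the URL with the client's Accept header. This is only an illustration: the `origin` function and the `NegotiatingCache` class are hypothetical names, not any CDN's API, and the exact header string is used as the key (a real cache would probably normalize it).

```python
from typing import Callable, Dict, Tuple


def origin(url: str, accept: str) -> str:
    """Hypothetical origin server: picks a representation from the Accept header."""
    if "application/json" in accept:
        return f"JSON for {url}"
    if "text/turtle" in accept:
        return f"Turtle for {url}"
    return f"HTML for {url}"


class NegotiatingCache:
    """Toy cache keyed on (URL, Accept) instead of the URL alone."""

    def __init__(self, fetch: Callable[[str, str], str]) -> None:
        self.fetch = fetch
        self.store: Dict[Tuple[str, str], str] = {}
        self.origin_hits = 0  # how often the origin had to be contacted

    def get(self, url: str, accept: str) -> str:
        key = (url, accept)
        if key not in self.store:
            # Cache miss: ask the origin for this specific representation.
            self.origin_hits += 1
            self.store[key] = self.fetch(url, accept)
        return self.store[key]


cache = NegotiatingCache(origin)
url = "http://graph.facebook.com/daft-punk"
json_body = cache.get(url, "application/json")   # miss, fetched from origin
html_body = cache.get(url, "text/html")          # miss, different Accept header
json_again = cache.get(url, "application/json")  # hit, served from cache
```

With a URL-only key, the second request would have received the cached JSON; here it triggers a second origin fetch, and only the third request is a cache hit.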

Which CDN solutions support content negotiation and also cache negotiated content?
Note that merely respecting Accept is not enough; the negotiated responses should be cached too.



PS1: It's easy to make your own caching servers support it. For instance, for nginx:

proxy_cache_key "$scheme$host$request_uri$http_accept";

Note how the client's Accept header is part of the key that indexes the cache. I want the same behavior from a CDN.


PS2: It is not an option to use different URLs for different representations. My application is in the Linked Data domain, where URLs play an important role for identification.

Ruben Verborgh
  • The first thing that has to be fixed is the Facebook server headers, I think. They lack `Vary: Accept`, which would tell the cache that the Accept header influences the response. – letmaik Nov 11 '15 at 17:36
  • @neo True, but still, it doesn't work for servers that correctly set the `Vary` header. – Ruben Verborgh Nov 11 '15 at 21:25
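For reference, the role of `Vary` mentioned in these comments can be sketched as follows: a cache that honors it extends its key with the request headers the response names. This is a simplified illustration with a hypothetical helper `secondary_key`; real HTTP caches also handle cases such as `Vary: *` and header normalization, which are ignored here.

```python
from typing import Dict, Tuple


def secondary_key(vary: str, request_headers: Dict[str, str]) -> Tuple[Tuple[str, str], ...]:
    """Build the part of a cache key selected by the response's Vary header.

    Simplified assumption: header names are compared case-insensitively and
    header values are used verbatim, without any normalization.
    """
    lowered = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    return tuple((name, lowered.get(name, "")) for name in names)


# With "Vary: Accept", the two clients get different cache entries.
with_vary_json = secondary_key("Accept", {"Accept": "application/json"})
with_vary_html = secondary_key("Accept", {"Accept": "text/html"})

# Without a Vary header, both clients collapse onto the same entry.
without_vary_json = secondary_key("", {"Accept": "application/json"})
without_vary_html = secondary_key("", {"Accept": "text/html"})
```

A server that omits `Vary: Accept` gives such a cache no reason to separate the JSON and HTML representations, which is the point of the first comment; the second comment notes that the CDN in question does not separate them even when the header is present.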

2 Answers

0

It seems MaxCDN can still set up custom nginx rules for content negotiation (despite what their FAQ says): http://blog.maxcdn.com/how-to-reduce-image-size-with-webp-automagically/#comment-1048561182

-2

I can't think of any way we would impact this at all at this time. We don't, for example, cache HTML by default. Have you actually seen an issue with this? Have you opened a support ticket?

damoncloudflare
  • I had created a support request (#68719) before I started this thread. The staff was very helpful, but they confirmed that CloudFlare does not support caching content-negotiated responses. They also said that it's "not on the near-term roadmap to add that". The default is extension-based, so files with the `.html` _extension_ are indeed not cached by default (but this can be changed). Issue: try caching any site with content negotiation. Only the first-served representation is cached, and it is served repeatedly regardless of new clients' `Accept` header. Would this improve in the future? – Ruben Verborgh Dec 03 '13 at 10:09
  • Question updated with example of what goes wrong with CloudFlare (and others I've tested). – Ruben Verborgh Dec 03 '13 at 10:17