We're building REST APIs in which we use ETags for two purposes:
- Save bandwidth by allowing the client to avoid reloading a resource (not that important to us)
- Address concurrency issues (lost update problem)
From a practical perspective, I'm wondering what to use to compute the ETag.
Item hash
We're currently using a hash of the JSON dump of the item object sent in the response. This works fine, and it is easy to check on a PUT request: pull the item from the DB, compute its hash, and compare. However, it makes the separation of concerns a bit "leaky": the layer that builds the response from the item is interleaved with the layer responsible for ETag computation. Besides, additional data (response headers) may matter, and if it does, sending a 304 just because the item itself didn't change, while the headers did, might not be appropriate.
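For illustration, the item-hash check could look roughly like this in Python. This is a minimal sketch, not our actual code: `item_etag` and `precondition_status` are hypothetical names, SHA-1 is an arbitrary choice of hash, and answering 428 when `If-Match` is missing is an assumed policy.

```python
import hashlib
import json
from typing import Optional


def item_etag(item: dict) -> str:
    """Build a quoted ETag from a canonical JSON dump of the item."""
    # sort_keys gives a stable dump regardless of key order in the dict.
    dump = json.dumps(item, sort_keys=True, separators=(",", ":"))
    return '"' + hashlib.sha1(dump.encode("utf-8")).hexdigest() + '"'


def precondition_status(if_match: Optional[str], stored_item: dict) -> int:
    """Return the status a PUT/DELETE handler would act on.

    200 here only means "precondition passed, go ahead with the write".
    """
    if if_match is None:
        return 428  # Precondition Required (assumed policy)
    # If-Match may carry "*" or a comma-separated list of ETags.
    candidates = [tag.strip() for tag in if_match.split(",")]
    if "*" in candidates or item_etag(stored_item) in candidates:
        return 200
    return 412  # Precondition Failed
```

The comparison and the write that follows would need to happen in the same transaction; otherwise an update can still be lost between the check and the write.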
Response hash
Another approach would be to just hash the response before sending it. This makes the computation part of the ETag layer much cleaner. However, on a PUT request, we can't just pull the item from the DB to check the ETag, as we don't have the extra data (headers, metadata) that went into the hash.
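A rough Python sketch of what I mean by hashing the response, for the GET side only. The names `response_etag` and `conditional_get` are hypothetical, and which headers go into the hash (here just `Content-Type` by default) is an assumption; the `ETag` header itself and volatile headers such as `Date` would obviously have to stay out.

```python
import hashlib
from typing import Iterable, Mapping, Optional, Tuple


def response_etag(
    body: bytes,
    headers: Mapping[str, str],
    hashed_headers: Iterable[str] = ("Content-Type",),
) -> str:
    """Build a quoted ETag from the body plus a chosen set of headers."""
    lowered = {name.lower(): value for name, value in headers.items()}
    digest = hashlib.sha1()
    # Feed the selected headers in a fixed order so the hash is stable.
    for name in sorted(h.lower() for h in hashed_headers):
        digest.update(f"{name}:{lowered.get(name, '')}\n".encode("utf-8"))
    digest.update(b"\n" + body)
    return '"' + digest.hexdigest() + '"'


def conditional_get(
    if_none_match: Optional[str], body: bytes, headers: Mapping[str, str]
) -> Tuple[int, bytes, str]:
    """Return (status, body, etag): 304 with an empty body on a match."""
    etag = response_etag(body, headers)
    if if_none_match is not None:
        candidates = []
        for tag in if_none_match.split(","):
            tag = tag.strip()
            # If-None-Match uses weak comparison, so drop a W/ prefix.
            candidates.append(tag[2:] if tag.startswith("W/") else tag)
        if "*" in candidates or etag in candidates:
            return 304, b"", etag
    return 200, body, etag
```

With this, a change in a hashed header alone yields a new ETag, which is what we want for caching. The PUT side is where it breaks down: the server would have to rebuild the full response just to validate `If-Match`.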
The first approach (item hash) seems appropriate for use case 2 (concurrency issues). The second approach (hash of the whole payload, including metadata and headers) would be appropriate for use case 1 (saving bandwidth).
Putting every bit of the response (including headers) into the ETag seems right, as any change there may be relevant and require the client to refresh its cache. But I don't know how to manage concurrency on PUT or DELETE requests with such an ETag.
So, in practice, should we use the item hash or the response hash, and how can we handle both use cases with just one of them?