0

I have an html file that can be accessed by browsing to

https://localhost:8080/contextRoot/home.html

This html uses 2 images:

<img src="https://localhost:8080/contextRoot/image1.jpg">
<img src="https://localhost:8080/static/images/image2.jpg">

The first image is packaged in my war file and loads fine. When I reload the page, the image is fetched from the cache instead of being downloaded again. I can see this in my browser's developer tools.

The second image also loads fine, but it is downloaded every time the page is requested. It is never cached. It uses a special java servlet to handle what we call static content:

<servlet>
    <servlet-name>staticFileServlet</servlet-name>
    <servlet-class>com.company.web.file.StaticFileServlet</servlet-class>
</servlet>

<servlet-mapping>
    <servlet-name>staticFileServlet</servlet-name>
    <url-pattern>/static/*</url-pattern>
</servlet-mapping>

This servlet searches the folder C://images/ on the computer's disk for the requested file (image2.jpg in this example) and serves it by writing its bytes to the response, while also adding a Content-Type header to the response (so the browser knows what kind of file it is receiving).

I think I might have to add additional headers to tell the browser that this content should be cached. Can the Cache-Control header help me here? However, I thought browsers were smart enough to cache responses regardless of which headers I do or don't send.

Here are the response headers for an image that is successfully cached (served from the war file):

Accept-Ranges:bytes
Content-Length:354
Content-Type:image/gif
Date:Mon, 04 Jan 2016 09:43:42 GMT
ETag:W/"354-1449227028000"
Last-Modified:Fri, 04 Dec 2015 11:03:48 GMT
Server:Apache-Coyote/1.1

Here is an example of an image that is served by the servlet and isn't cached:

Cache-Control:max-age:864000
Content-Type:image/jpeg
Date:Mon, 04 Jan 2016 13:59:04 GMT
Server:Apache-Coyote/1.1
Transfer-Encoding:chunked

EDIT: my files are behind an SSL connection, which could cause caching to be denied. However, I'm certain it is not the server denying this caching because

  1. It is caching some images.
  2. There are no headers (Pragma, ETag, Cache-Control...) set on the response.

Does Google Chrome automatically refuse caching for (some) SSL connections?

user1884155
  • 1
    Open your developer's console, then reload the page, either after clearing your cache or use a shortcut to refresh the page and ignore the cache (e.g. Chrome on Windows is CTRL+F5 or SHIFT+CTRL+F5). For both images, check the response headers to see if there is something different. – Über Lem Jan 04 '16 at 13:14
  • The image that is being cached has the Last-Modified header, whereas the image that isn't cached doesn't have this header. I don't see any headers, in either the request or the response, that deal with caching, such as Cache-Control, Pragma etc. – user1884155 Jan 04 '16 at 13:44
  • I don't know which server is being used (not unimportant information to provide by the way), but it might be that it is configured to add no-caching headers to servlet calls. – Gimby Jan 04 '16 at 13:56
  • I'm using JBoss, which uses Apache-Coyote/1.1 under the hood. I don't see headers appear in the request/response when I use chrome's developer tools, shouldn't they be in there if apache added them automatically? – user1884155 Jan 04 '16 at 14:02

2 Answers

1

A servlet per se is meant to generate content programmatically, so by default its responses are typically not cached. However, you can control the caching behavior by adding a Cache-Control header, as you mentioned, and by returning a 304 status on subsequent requests if the required conditional request headers are present. Images can be created on the fly with varying content, so the fact that the response is an image is not a good indicator of whether it should be cached. See also:

http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html
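The 304 decision described above can be sketched in plain Java. This is a minimal illustration, not code from the servlet in the question: the class name `ConditionalGet` and the method `statusFor` are hypothetical, and it only handles `If-Modified-Since` (not `If-None-Match`). In a real servlet, the header value would presumably come from `HttpServletRequest.getHeader("If-Modified-Since")` and the timestamp from the file on disk.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;
import java.util.Locale;

// Hypothetical helper: decides between 200 and 304 for a conditional GET.
public class ConditionalGet {
    // HTTP-date format as used in Last-Modified / If-Modified-Since, always in GMT.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                    .withZone(ZoneOffset.UTC);

    public static int statusFor(String ifModifiedSince, Instant lastModified) {
        if (ifModifiedSince == null) {
            return 200; // first request: no validator sent, so send the full body
        }
        try {
            Instant since = HTTP_DATE.parse(ifModifiedSince, Instant::from);
            // HTTP dates have one-second resolution, so drop sub-second precision.
            Instant modified = lastModified.truncatedTo(ChronoUnit.SECONDS);
            return modified.isAfter(since) ? 200 : 304;
        } catch (DateTimeParseException e) {
            return 200; // unparseable header: ignore it and send the full body
        }
    }

    public static void main(String[] args) {
        Instant fileTime = Instant.parse("2015-12-04T11:03:48Z");
        System.out.println(statusFor(null, fileTime));
        System.out.println(statusFor("Fri, 04 Dec 2015 11:03:48 GMT", fileTime));
    }
}
```

For the browser to send `If-Modified-Since` at all, the first response has to carry a `Last-Modified` header, which is one of the headers missing from the servlet response shown in the question.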

A good way to implement this in a single place is by using a filter, see e.g.

https://github.com/samaxes/javaee-cache-filter

Remigius Stalder
  • I understand servlets COULD serve dynamic content, but my servlet doesn't, and I don't see how the browser can possibly know what happens to the request it launches so as to alter or disable caching. I'm having trouble with the Cache-Control header: I can't seem to turn on the cache with it, only turn it off or limit the duration of the cache? – user1884155 Jan 04 '16 at 13:46
  • did you look at the response headers? – Remigius Stalder Jan 04 '16 at 13:47
  • Yes I did. The image that works has an ETAG header, however I also have an example of gif file with an etag header that doesn't work. I now added the Cache-control header to the images that didn't work, with a max-age of 3600*24 (1 day). I see this header in my browser's developer tools, so it's there. Alas, nothing happens and the image is still reloaded every time I request the given html page. – user1884155 Jan 04 '16 at 13:57
  • See https://github.com/samaxes/javaee-cache-filter/wiki/CacheFilter - it looks like it's not only about the cache-control header, but the server also adds a 304 status (which I have verified just now by looking at a static resource's headers on Chrome). This can all be achieved by using the indicated filter. – Remigius Stalder Jan 04 '16 at 14:03
  • I just checked the source code of CacheFilter (on the GitHub link you provided), and I don't see it setting the 304 status anywhere. Can you give me the relevant line number so I can see what code you are talking about? Additionally, the images that are being cached successfully AND those that are not being cached both have an HTTP status of 200 (OK) in my developer tools, so I'm not sure what the HTTP status has to do with this? – user1884155 Jan 04 '16 at 14:09
  • The resource mentions "the server will return a 304 Not Modified" - not the filter, the RFC specifies the interplay of client (i.e. browser) and server. Of course, the server should not return a 304 on the first, but only on subsequent requests as long as the content has not expired, and obviously without a request to the server the client cannot know whether the content was modified or not. Did you try the filter at all or just look at the code? – Remigius Stalder Jan 04 '16 at 14:22
  • I looked at the filter, then hardcoded what the filter does: add the cache-control header. This doesn't work. I don't see the additional benefit of the filter to be honest, it seems more useful for DISABLING the cache. I cannot add the 304 status to my own servlet, this is something the apache container should do. But it doesn't do that, I guess because it doesn't consider servlets to be a static resource. – user1884155 Jan 04 '16 at 14:29
  • Servlets can do basically everything to a response that can be done for static resources. In fact, typical servlet containers use servlets to serve static resources (which display correct caching behavior). I won't go as far as providing a sample webapp for you, but I am inclined to believe on first approximation that the filter does what it claims. Otherwise, feel free to study the RFC and implement what it says (which I am also inclined to believe will work). – Remigius Stalder Jan 04 '16 at 14:33
  • I'm sorry, I meant: I can add the status, but I do not want to programmatically implement it so that it doesn't set 304 on the first request but does set 304 on subsequent requests. This is an error-prone implementation of something I would expect to happen automatically. I will implement the filter fully, but I predict it will do nothing. It just sets some headers, have you looked at the source code? – user1884155 Jan 04 '16 at 14:37
  • see also the following comment in the filter's source code: "By default, some servers (e.g. Tomcat) will set headers on any SSL content to deny caching. Omitting the Pragma header takes care of user-agents implementing HTTP/1.0." (coyote is Tomcat afaik) – Remigius Stalder Jan 04 '16 at 14:37
  • I noticed that comment, I checked the headers in my browser and there are no pragma headers. So apache doesn't do this behaviour (I edited my original post to clarify this). Also, I found the reason why it doesn't work (thanks to the filter): it's the EXPIRES header in addition to the DATE header that makes google chrome cache the content. The actual requests all have http status 200 (not 304). Thanks for the help – user1884155 Jan 04 '16 at 14:59
0

Make sure the following headers are set on your response to tell the browser that the response to identical subsequent requests should be cached, in this example for 10 hours (36000 seconds):

  • Cache-Control:public, max-age=36000
  • Content-Type:image/jpeg
  • Date:Mon, 04 Jan 2016 14:56:52 GMT
  • Expires:Tue, 05 Jan 2016 00:56:52 GMT

You can change the amount of caching time by altering the value of the headers.
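As a rough sketch of how these values could be computed rather than hardcoded, the following plain-Java helper derives `Cache-Control`, `Date` and `Expires` from a single max-age in seconds so that they stay consistent with each other. The class name `CacheHeaders` and the method `build` are hypothetical; in a servlet, each entry would presumably be passed to `HttpServletResponse.setHeader(name, value)`.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: builds caching headers from one max-age value.
public class CacheHeaders {
    // HTTP-date format with a two-digit day, always expressed in GMT.
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                    .withZone(ZoneOffset.UTC);

    public static Map<String, String> build(Instant now, long maxAgeSeconds) {
        Map<String, String> headers = new LinkedHashMap<>();
        // Note the '=' between max-age and its value; a ':' there is not valid.
        headers.put("Cache-Control", "public, max-age=" + maxAgeSeconds);
        headers.put("Date", HTTP_DATE.format(now));
        // Expires is simply the response date plus the same max-age.
        headers.put("Expires", HTTP_DATE.format(now.plusSeconds(maxAgeSeconds)));
        return headers;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2016-01-04T14:56:52Z");
        build(now, 36000).forEach((name, value) -> System.out.println(name + ":" + value));
    }
}
```

Passing 86400 instead of 36000 would give a lifetime of one full day.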

user1884155