12

Using PHP, how can I accurately test whether a remote website supports the "If-Modified-Since" HTTP header?

From what I have read, if the remote file you GET has been modified since the date specified in the header request - it should return a 200 OK status. If it hasn't been modified, it should return a 304 Not Modified.
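To make the rule above concrete, here is a minimal sketch. The helper names `ifModifiedSinceHeader` and `describeConditionalStatus` are hypothetical, made up for illustration. It shows how the header line is built and why a 200 on its own is ambiguous: it can mean "modified since that date" or "header ignored".

```php
<?php
// Hypothetical helper: build an If-Modified-Since header line from a Unix timestamp.
function ifModifiedSinceHeader(int $timestamp): string
{
    // HTTP dates are always expressed in GMT.
    return 'If-Modified-Since: ' . gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

// Hypothetical helper: interpret the status code of a conditional request.
function describeConditionalStatus(int $status): string
{
    if ($status === 304) {
        return 'supported';    // the server evaluated the condition
    }
    if ($status === 200) {
        return 'ambiguous';    // modified since the date, or header ignored
    }
    return 'inconclusive';     // redirects, errors, and so on
}

echo ifModifiedSinceHeader(0), "\n"; // If-Modified-Since: Thu, 01 Jan 1970 00:00:00 GMT
```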

Therefore my question is, what if the server doesn't support "If-Modified-Since" but still returns a 200 OK?

There are a few tools out there that check if your website supports "If-Modified-Since" so I guess I'm asking how they work.

Edit:

I have performed some testing using cURL, sending the following:

// Create the handle; $url holds the address under test.
$ch = curl_init();
// 60*60*60*60 seconds is 150 days, so this date lies in the future.
curl_setopt($ch, CURLOPT_HTTPHEADER, array("If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',time()+60*60*60*60)));
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
curl_setopt($ch, CURLOPT_HEADER, true);
// CURLOPT_NOBODY turns this into a HEAD request, so only headers come back.
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FORBID_REUSE, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
curl_setopt($ch, CURLOPT_TIMEOUT, 4);
$response = curl_exec($ch);

i.e. with a date in the future, google.com returns:

HTTP/1.0 304 Not Modified
Date: Fri, 05 Feb 2010 16:11:54 GMT
Server: gws
X-XSS-Protection: 0
X-Cache: MISS from .
Via: 1.0 .:80 (squid)
Connection: close

and if I send:

$ch = curl_init();
// Same request, but with a date 150 days in the past.
curl_setopt($ch, CURLOPT_HTTPHEADER, array("If-Modified-Since: ".gmdate('D, d M Y H:i:s \G\M\T',time()-60*60*60*60)));
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FORBID_REUSE, true);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
curl_setopt($ch, CURLOPT_TIMEOUT, 4);
$response = curl_exec($ch);

i.e. with a date in the past, google.com returns:

HTTP/1.0 200 OK
Date: Fri, 05 Feb 2010 16:09:12 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-XSS-Protection: 0
X-Cache: MISS from .
Via: 1.0 .:80 (squid)
Connection: close

If I then send both requests to bbc.co.uk (which doesn't support it):

The future one returns:

HTTP/1.1 200 OK
Date: Fri, 05 Feb 2010 16:12:51 GMT
Server: Apache
Set-Cookie: BBC-UID=84bb66bc648318e367bdca3ad1d48cf627005b54f090f211a2182074b4ed92c40ForbSoft%20Web%20Diagnostics%20%28URL%20Validator%29; expires=Tue, 04-Feb-14 16:12:51 GMT; path=/; domain=bbc.co.uk;
Accept-Ranges: bytes
Cache-Control: max-age=0
Expires: Fri, 05 Feb 2010 16:12:51 GMT
Pragma: no-cache
Content-Length: 111677
Content-Type: text/html

The one with the date in the past returns:

HTTP/1.1 200 OK
Date: Fri, 05 Feb 2010 16:14:01 GMT
Server: Apache
Set-Cookie: BBC-UID=841b66ec44232cd91e81e88a014a3c5e50ed4e20c0e07174c4ff59675cd2fa210ForbSoft%20Web%20Diagnostics%20%28URL%20Validator%29; expires=Tue, 04-Feb-14 16:14:01 GMT; path=/; domain=bbc.co.uk;
Accept-Ranges: bytes
Cache-Control: max-age=0
Expires: Fri, 05 Feb 2010 16:14:01 GMT
Pragma: no-cache
Content-Length: 111672
Content-Type: text/html

So my question still stands.

J.C
  • Please post the curl commands you're using, I'm testing command line and all I'm getting is 200s no matter what header I send – adamJLev Feb 05 '10 at 16:25
  • If the server doesn’t support *If-Modified-Since* but still returns the 200 status code, then it’s as if you had sent the request without *If-Modified-Since* and the server had responded with the 200 status code. There is no difference. 200 is 200, “The request has succeeded.” – Gumbo Feb 05 '10 at 16:28
  • @Infinity - I have added the curl commands/options in my original post above. – J.C Feb 08 '10 at 11:43
  • The question is wrong per se, because it is the requested location/entity that has to support this header, not the website or the server (as pointed out by @Infinity). So testing a single URL tells you nothing about the rest of the server's requestable entities. – hurikhan77 Feb 08 '10 at 16:11

3 Answers

8

I have performed some testing on this and it appears to work as follows;

If you send an If-Modified-Since header with a date that is in the past (5 minutes before the current time should do it), then sites such as google.com, w3.org and mattcutts.com will return an "HTTP/1.1 304 Not Modified" status. Sites such as yahoo.com, bbc.co.uk and stackoverflow.com always return an "HTTP/1.1 200 OK".

The "Last-Modified" header has nothing to do with "If-Modified-Since" because the whole point of sending back a "HTTP/1.1 304 Not Modified" header is that you don't have to send the body with it (thus saving bandwidth - which is the whole point behind this).

Therefore, the answer to my question is that if a site doesn't return a "HTTP/1.1 304 Not Modified" header when you send an "If-Modified-Since 5 mins ago" header, the site doesn't support the "If-Modified-Since" request properly.

If I am incorrect, please say so and provide testing to show.

Edit: I forgot to add that a good test is to make a normal HEAD request to the domain (e.g. w3.org), grab the "Last-Modified" date and then make another request with "If-Modified-Since:" set to that date. This will test that both the "Last-Modified" value and the "If-Modified-Since" request are supported. Please note: just because the server sends back a "Last-Modified" date doesn't mean it supports "If-Modified-Since".
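The two-step test described above could be sketched roughly as follows. This is only an illustration, not a definitive implementation: the function names (`headerValue`, `headRequest`, `supportsIfModifiedSince`) are made up here, redirects are not followed, and the result applies only to the single URL tested.

```php
<?php
// Extract one header value (case-insensitive) from a raw header block.
// Returns null when the header is absent.
function headerValue(string $rawHeaders, string $name): ?string
{
    foreach (preg_split('/\r\n|\n/', $rawHeaders) as $line) {
        if (stripos($line, $name . ':') === 0) {
            return trim(substr($line, strlen($name) + 1));
        }
    }
    return null;
}

// Send a HEAD request and return [status code, raw response headers].
function headRequest(string $url, array $headers = []): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_NOBODY, true);         // HEAD: headers only
    curl_setopt($ch, CURLOPT_HEADER, true);         // include headers in the result
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return instead of printing
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 4);
    curl_setopt($ch, CURLOPT_TIMEOUT, 4);
    $raw = curl_exec($ch);
    $status = (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    return [$status, $raw === false ? '' : $raw];
}

// true  = 304 received when echoing back the Last-Modified date
// false = 200 received even though the date matched
// null  = inconclusive (no Last-Modified header, redirect, error, ...)
function supportsIfModifiedSince(string $url): ?bool
{
    [$status, $raw] = headRequest($url);
    $lastModified = headerValue($raw, 'Last-Modified');
    if ($status !== 200 || $lastModified === null) {
        return null;
    }
    [$second] = headRequest($url, ['If-Modified-Since: ' . $lastModified]);
    if ($second === 304) {
        return true;
    }
    return $second === 200 ? false : null;
}
```

Calling `supportsIfModifiedSince('http://www.w3.org/')` would run both requests. A null result means the URL gave no "Last-Modified" date to echo back, in which case the "5 minutes ago" probe described earlier is the fallback.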

J.C
  • I'm glad you found a solution, but I did mention that in my answer as a more "practical" way to infer the capability of the server, versus the more "theoretical" header approach. Quoting myself: "Maybe you can just do two requests, one followed by another, sending a If-Modified-Since header, and then verify if the second request is a 304 or a 200." – adamJLev Feb 08 '10 at 17:21
  • @Infinity - If you read my answer you'll see yours is barking up a different tree, but I can see what you mean by the "practical" approach, which is ultimately where I took it. – J.C Feb 09 '10 at 09:18
  • First of all, each browser handles this concept differently. Secondly, using Chrome v22, the server must send a "Last-Modified" header for Chrome to send a subsequent "If-Modified-Since" header. This answer is not correct. – Matty F Oct 15 '12 at 23:30
5

If the entity returns a "Last-Modified" header, then it supports it. Makes sense really.

More info: http://httpd.apache.org/docs/2.2/caching.html (A Brief Guide to Conditional Requests)

Obviously only static pages/files will have that header. With dynamic content (ASP, PHP, etc.) there is no way to know from the headers (unless the site handles caching manually, e.g. like this), and the entity may or may not support If-Modified-Since, in my experience.

Maybe you can just do two requests, one after the other, with the second sending an If-Modified-Since header, and then check whether the second response is a 304 or a 200.

EDIT: hurikhan77 points out an important caveat: testing, for example, the root of the site for this capability does not guarantee that the rest of the site does or doesn't support it too.

adamJLev
1

Regarding the first answer above, I'd like to note that conditional requests make as much sense for dynamic content as they do for static content. If the code that generates the dynamic content knows that the backend entity (e.g. a database item) has not changed, it should send a 304 in response to a conditional request.
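As a rough sketch of what that could look like in a PHP script: the function names below are hypothetical, and `$lastChanged` stands in for whatever timestamp the backend (e.g. a database row) actually provides.

```php
<?php
// Decide whether a 304 is appropriate, given the request's If-Modified-Since
// value (null when absent) and the backend entity's last change (Unix time).
function isNotModified(?string $ifModifiedSince, int $lastChanged): bool
{
    if ($ifModifiedSince === null) {
        return false;
    }
    $since = strtotime($ifModifiedSince);
    // An unparseable date is ignored, as if the header were missing.
    return $since !== false && $lastChanged <= $since;
}

// Hypothetical handler: send either a bodiless 304 or the full page.
function sendConditionalResponse(int $lastChanged, string $body): void
{
    header('Last-Modified: ' . gmdate('D, d M Y H:i:s', $lastChanged) . ' GMT');
    if (isNotModified($_SERVER['HTTP_IF_MODIFIED_SINCE'] ?? null, $lastChanged)) {
        http_response_code(304); // a 304 carries no body
        return;
    }
    echo $body;
}
```

A page would call `sendConditionalResponse($lastChanged, $html)` before producing any other output, so that the headers can still be sent.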

Jan

Jan Algermissen