3

I'm looking for information about whether and how PHP's http stream wrapper attempts to cache files. Can anyone point to information about this?

An answer to another question, "Does PHPs fopen function implement some kind of cache?", suggests that the wrapper may attempt to honor cache headers, but I have not found anything in the documentation about this.

Specifically I'm wondering:

  • Will PHP cache files accessed via http:// URLs?
  • If it does, how long will it keep them?
  • Is there a maximum size for the cache?
  • Is there a maximum size per file that it will cache?
  • Does the cache persist between requests?
  • Out of curiosity, does it cache in memory or on disk? Where?
bkit

3 Answers

10

Short response: Q1 No. Q2-5 Not applicable.

Longer response: The answers in "Does PHPs fopen function implement some kind of cache?" are wrong, at least on Linux, and since the same PHP codebase is used for the Windows builds, most likely there as well.

The claims there ran counter to my understanding, so rather than guess I checked by running:

$ echo "Hello World" > /var/www/xx.txt
$ php -r 'echo file_get_contents("/var/www/xx.txt");'
Hello World
$ strace -tt -o /tmp/strace  \
> php -r 'echo file_get_contents("http://localhost/xx.txt");'
Hello World

and then looking at the system call trace. Here is the relevant excerpt:

00:15:41.887904 socket(PF_INET6, SOCK_STREAM, IPPROTO_IP) = 3
00:15:41.888029 fcntl(3, F_GETFL)       = 0x2 (flags O_RDWR)
00:15:41.888148 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
00:15:41.888265 connect(3, {sa_family=AF_INET6, sin6_port=htons(80), inet_pton(AF_INET6, "::1", &sin6_addr), sin6_flowinfo=0, sin6_scope_id=0}, 28) = -1 EINPROGRESS (Operation now in progress)
00:15:41.888487 poll([{fd=3, events=POLLIN|POLLOUT|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=3, revents=POLLOUT}])
00:15:41.888651 getsockopt(3, SOL_SOCKET, SO_ERROR, [0], [4]) = 0
00:15:41.888838 fcntl(3, F_SETFL, O_RDWR) = 0
00:15:41.888975 sendto(3, "GET /xx.txt HTTP/1.0\r\n", 22, MSG_DONTWAIT, NULL, 0) = 22
00:15:41.889172 sendto(3, "Host: localhost\r\n", 17, MSG_DONTWAIT, NULL, 0) = 17
00:15:41.889307 sendto(3, "\r\n", 2, MSG_DONTWAIT, NULL, 0) = 2
00:15:41.889437 poll([{fd=3, events=POLLIN|POLLPRI|POLLERR|POLLHUP}], 1, 0) = 0 (Timeout)
00:15:41.889544 poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=3, revents=POLLIN}])
00:15:41.891066 recvfrom(3, "HTTP/1.1 200 OK\r\nDate: Wed, 15 F"..., 8192, MSG_DONTWAIT, NULL, NULL) = 285
00:15:41.891235 poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=3, revents=POLLIN}])
00:15:41.908909 recvfrom(3, "", 8192, MSG_DONTWAIT, NULL, NULL) = 0
00:15:41.909016 poll([{fd=3, events=POLLIN|POLLERR|POLLHUP}], 1, 60000) = 1 ([{fd=3, revents=POLLIN}])
00:15:41.909108 recvfrom(3, "", 8192, MSG_DONTWAIT, NULL, NULL) = 0
00:15:41.909198 close(3)                = 0
00:15:41.909323 write(1, "Hello World\n", 12) = 12
00:15:41.909532 munmap(0x7ff3866c9000, 528384) = 0
00:15:41.909600 close(2)                = 0
00:15:41.909648 close(1)                = 0

A GET request to localhost, a response, an echo to STDOUT, and shutdown. No caching. Nada. Sorry.

TerryE
  • +1, breaking out a tracing tool is far more bad-ass than source diving. – Charles Feb 15 '12 at 00:42
  • I just tested this another way as well. I made a PHP script that incremented a counter, and another script that accessed it with file_get_contents multiple times in a row. This test also showed that requests were NOT cached. But when testing things like this, there is always the chance of a false negative. Maybe it didn't use a cache this time because of some specific gotcha, but would another time (e.g. maybe caching only works within a single PHP invocation, and is disabled on localhost). – bkit Feb 15 '12 at 21:46
  • Anyway, given that both our tests showed no caching, and Charles below found no caching-related code, I'm thinking it's safe to say the http stream wrapper does not cache. Now Charles and TerryE, I'm left with the question: whose answer do I accept? – bkit Feb 15 '12 at 21:53
  • Well you have a choice: You can accept one option based on three _evidence_-based determinations, or you can take the other articulated in the other thread based on "I'm _assuming_ that PHP will be honouring any cache headers the server is responding with". (My italics in the quote.) Your choice :-) – TerryE Feb 15 '12 at 22:46
  • @TerryE I meant "which answer to this question do I mark accepted?". Both you and Charles have pretty good answers, and commented on each other's. My thanks to you both. – bkit Feb 16 '12 at 18:39
  • Sorry, I misunderstood. I suspect that both Charles and I appreciate your feedback and the interchange between the three of us more. The tick comes a LONG way second. Toss a coin. The points aren't nearly so important. The main benefit that I get -- apart from sharing my knowledge with others -- is that the process of answering Qs can really challenge me and make me think about aspects that I would otherwise take for granted. OK, 75% of Qs could be avoided by the poster Googling and adding a bit of thought, but at the other end 5-10% are pure gold-dust for those answering :-) – TerryE Feb 17 '12 at 00:16
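
The counter test bkit describes in the comments above could be sketched roughly as follows. This is only an illustration: `fetch_repeatedly()` and `all_distinct()` are hypothetical helper names, and the script assumes some endpoint (for example a `counter.php` on localhost) that returns a new value on every real request. The fetcher is injectable so the logic can be tried without a web server.

```php
<?php
/**
 * Fetch $url $times in a row and return the bodies received.
 * $fetch defaults to file_get_contents, i.e. the http stream wrapper
 * for http:// URLs. (Hypothetical helper, not part of PHP.)
 */
function fetch_repeatedly(string $url, int $times, ?callable $fetch = null): array
{
    $fetch = $fetch ?? 'file_get_contents';
    $bodies = [];
    for ($i = 0; $i < $times; $i++) {
        $bodies[] = $fetch($url);
    }
    return $bodies;
}

/**
 * True when every body differs from the others. Against an endpoint
 * that increments a counter per request, that means no response was
 * served from a cache.
 */
function all_distinct(array $bodies): bool
{
    return count(array_unique($bodies)) === count($bodies);
}
```

Against a real counter endpoint, one would call something like `all_distinct(fetch_repeatedly('http://localhost/counter.php', 5))` and expect `true` if nothing is cached. As bkit notes, a result like this cannot rule out caching under other conditions.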
3

The best way to get a definitive answer to this question is to look at the source.

/ext/standard/http_fopen_wrapper.c is where the http fopen wrapper is defined.

There is no caching here whatsoever. Every request is composed of a manually assembled HTTP request made over a socket, not relying at all on any third party code which might add caching unknowingly.
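
To make "manually assembled HTTP request" concrete, here is a minimal PHP sketch of the idea. It is not the wrapper's actual C code; `build_http_request()` is a hypothetical name, and the output simply mirrors the three `sendto` calls visible in TerryE's trace (other PHP versions may send a different protocol version or extra headers).

```php
<?php
/**
 * Build a bare GET request for an absolute http:// URL, in the same
 * shape as the request lines seen in the strace output.
 * (Hypothetical helper for illustration only.)
 */
function build_http_request(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['host'])) {
        throw new InvalidArgumentException("Not an absolute URL: $url");
    }

    // Request target: path plus optional query string.
    $target = $parts['path'] ?? '/';
    if (isset($parts['query'])) {
        $target .= '?' . $parts['query'];
    }

    // Host header, including the port only when one was given explicitly.
    $host = $parts['host'];
    if (isset($parts['port'])) {
        $host .= ':' . $parts['port'];
    }

    // Nothing here consults or stores a cache: the request is just a string
    // that would be written to the socket.
    return "GET $target HTTP/1.0\r\n"
         . "Host: $host\r\n"
         . "\r\n";
}
```

The point of the sketch is that a request built this way has no place where a cache lookup could happen unless the code adds one explicitly.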

Charles
  • Maybe strace is more badass, but I still often end up source diving to find out what is _really_ going on. I've been doing that a lot in [mod_rewrite.c](http://svn.apache.org/viewvc/httpd/httpd/trunk/modules/mappers/mod_rewrite.c) recently, so +1 for the source diving also. :-) – TerryE Feb 15 '13:55
0

I have never encountered a situation where I felt it cached anything. Caching is something you should implement at the application level. There are also other libraries built on top of the stream API out there that may do it for you.
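
As one possible shape for application-level caching, here is a minimal sketch of a time-limited file cache in front of `file_get_contents`. Everything about it is an assumption for illustration: `cached_fetch()` is a hypothetical helper, the cache key is a SHA-1 of the URL, and freshness is judged purely by file age rather than by any HTTP cache headers.

```php
<?php
/**
 * Fetch $url, reusing a copy on disk if it is younger than $ttl seconds.
 * $cacheDir must already exist and be writable. $fetch defaults to
 * file_get_contents and is injectable for testing.
 * (Hypothetical helper, not part of PHP.)
 */
function cached_fetch(string $url, int $ttl, string $cacheDir, ?callable $fetch = null): string
{
    $fetch = $fetch ?? 'file_get_contents';
    $path = rtrim($cacheDir, '/') . '/' . sha1($url) . '.cache';

    // Avoid stale results from PHP's stat cache before checking file age.
    clearstatcache(true, $path);
    if (is_file($path) && time() - filemtime($path) < $ttl) {
        return file_get_contents($path);
    }

    // Cache miss or expired entry: do a real fetch and store the body.
    $body = $fetch($url);
    if ($body === false) {
        throw new RuntimeException("Could not fetch $url");
    }
    file_put_contents($path, $body, LOCK_EX);
    return $body;
}
```

A call such as `cached_fetch('http://localhost/xx.txt', 300, sys_get_temp_dir())` would then hit the network at most once every five minutes per URL. A production version would probably also want to respect the server's cache headers and handle concurrent writers more carefully.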

Louis-Philippe Huberdeau