While optimizing a site for memory, I noticed a jump in memory consumption when including a large number of PHP class files (600+) for a specific purpose. Taking things apart, I noticed that including a PHP file (and thus presumably compiling it to opcodes) takes about 50 times more memory than the file's size on disk.

In my case the files on disk total around 800 kB (including indentation and comments; pure class declarations, not many strings). After including them all, however, memory consumption was around 40 MB higher.

I measured like this (PHP 5.3.6):

echo memory_get_usage(), "<br>\n";
include($file);
echo memory_get_usage(), "<br>\n";

Within a loop over the 600 files I can watch memory consumption grow from basically zero to 40 MB. (There is no autoloader loading additional classes, nor any global code or constructor code that is executed immediately; it really is just the include.)
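The loop described above could look roughly like the following. This is only a sketch: the function name `measure_include_memory` and the directory in the usage comment are hypothetical, not taken from the actual site.

```php
<?php
// Hypothetical helper: include each file and record how far
// memory_get_usage() grows across the include.
function measure_include_memory(array $files)
{
    // Pre-fill the result array so that growing it does not
    // distort the measurements taken inside the loop.
    $deltas = array_fill_keys($files, 0);

    foreach ($files as $file) {
        $before = memory_get_usage();
        include $file;
        $deltas[$file] = memory_get_usage() - $before;
    }

    return $deltas;
}

// Usage (the path is an assumption for illustration only):
// $deltas = measure_include_memory(glob('/path/to/classes/*.php'));
// echo array_sum($deltas), " bytes in total\n";
```

Summing the per-file deltas gives the total growth, and sorting them shows whether a few files dominate or the cost is spread evenly.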

Is this normal behaviour? I assumed opcodes would be more compact than plain source code (with all whitespace and comments stripped out and, for example, just one or two instruction bytes instead of a "foreach" string).

If this is normal, is there a way to optimize it? (I assume using an opcode cache would just spare me the compile time, not the actual memory consumption?)

Wolfgang Stengel
  • You are just including a file. It's hard to imagine something *more* normal. But what is the opcode assumption based on? – Jon Jan 31 '13 at 22:16
  • I would expect memory consumption to grow after includes, but not by that much. The opcode assumption is based on a basic knowledge about how opcodes are supposed to work, having concise binary instruction _codes_ instead of long syntax _strings_, which I feel should be less memory intensive. – Wolfgang Stengel Jan 31 '13 at 22:25
  • What is in the files you are including? – Green Black Jan 31 '13 at 22:34
  • @John: Pure class definitions, here's an example: http://pastebin.com/YZefLmmp – Wolfgang Stengel Jan 31 '13 at 22:37
  • There's a lot more to the compilation from the PHP source to opcodes than what you mention. The main reason higher level programming languages exist is to allow more to be done with less code. This means that one function in the PHP source could perform dozens of smaller operations that are each represented by one opcode. – G-Nugget Jan 31 '13 at 22:42
  • Quick test for me shows 600 class definitions taking up 24kb. Did you consider perhaps it's what your classes do that's taking up the memory? – Leigh Jan 31 '13 at 22:42
  • @G-Nugget: That's exactly what I mean, the opcodes themselves should not take up that much memory, the functionality is in the executing engine. – Wolfgang Stengel Jan 31 '13 at 22:44
  • @Leigh: Nothing is executed, it's included only. No objects are created or functions called. – Wolfgang Stengel Jan 31 '13 at 22:45
  • @WolfgangStengel What I meant is that the PHP source for `foo($bar);` takes 9 bytes, but it could be compiled into 20 bytes of opcodes. The point is: compiled code is usually smaller, but the compilation process isn't that simple. – G-Nugget Jan 31 '13 at 22:50
  • @G-Nugget: That's sort of my point: Why would the opcodes for a simple function call be 20 bytes long? Is there a way to be sure that this is the issue? – Wolfgang Stengel Jan 31 '13 at 22:53
  • @WolfgangStengel Here's an example: processors don't have opcodes for multiplication or division. To do what is a simple operation in source code (`2/1`) takes several steps involving subtracting and comparing several times. The machine code for that is much larger than the source code. I'd write out some sample pseudo code for the process, but I'm running low on characters. – G-Nugget Jan 31 '13 at 22:59
  • @WolfgangStengel Check out the `vld` package from PECL – Leigh Jan 31 '13 at 23:00
  • @G-Nugget: But we're not talking about machine code here. PHP has a single opcode for division named DIV. – Wolfgang Stengel Jan 31 '13 at 23:01
  • @WolfgangStengel There is something you are overlooking. The opcodes are only part of what is generated. The opcodes tell the program what to do at a specific point during execution, but what is it operating on? It's operating on internal structures, class definitions, function definitions, variable definitions. While it may be a single byte opcode to assign a value to a variable, the variable in memory is represented by a comparatively large `zval` internally. Classes have data that represents the inheritance chain, how magic methods are hooked up, etc. Stuff the opcodes act upon. – Leigh Jan 31 '13 at 23:07
  • Here's a nice article. [How big are PHP arrays (and values) really? (Hint: BIG!)](http://nikic.github.com/2011/12/12/How-big-are-PHP-arrays-really-Hint-BIG.html). – Leigh Jan 31 '13 at 23:12
  • @Leigh: I understand, but I find it still surprising that it is bigger by a factor of 50 compared to the source. There are no constructors or any other code executed, so we only have the static properties of the classes as ZVALs in memory, which should be minimal compared to 40 MB. – Wolfgang Stengel Jan 31 '13 at 23:15
  • @Leigh: I'll give vld a spin, to see if it's actually the opcodes or something else, thanks for the idea. – Wolfgang Stengel Jan 31 '13 at 23:19

1 Answer

Apparently that's just the way it is.

I've retested this from the ground up:

  • Include an empty, zero-length file: memory consumption increases by 784 bytes
  • Include an empty `class X { }` definition: 2128 bytes
  • Include a class with one empty method: 2816 bytes
  • Include a class with two empty methods: 3504 bytes

The include file is under 150 bytes in size in all tests.
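A test along these lines can be reproduced with a small script such as the one below. It is a sketch, not the exact code used for the numbers above: the class names (`ProbeEmpty`, `ProbeOne`, `ProbeTwo`) and the temporary-file handling are assumptions, and the reported byte counts will differ between PHP versions and builds.

```php
<?php
// Four tiny include files, mirroring the cases listed above.
// Class names are hypothetical placeholders.
$cases = array(
    'empty file'        => '',
    'empty class'       => '<?php class ProbeEmpty { }',
    'one empty method'  => '<?php class ProbeOne { function a() { } }',
    'two empty methods' => '<?php class ProbeTwo { function a() { } function b() { } }',
);

// Pre-fill the results so storing a value does not allocate
// memory between the two measurements.
$results = array_fill_keys(array_keys($cases), 0);

foreach ($cases as $label => $source) {
    // Write the source to a temporary file so it can be included.
    $path = tempnam(sys_get_temp_dir(), 'inc');
    file_put_contents($path, $source);

    $before = memory_get_usage();
    include $path;
    $results[$label] = memory_get_usage() - $before;

    unlink($path);
}

foreach ($results as $label => $bytes) {
    echo $label, ': ', $bytes, " bytes\n";
}
```

Comparing the last two lines of output shows the incremental cost of one additional empty method, which is the figure that gets multiplied across hundreds of classes.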

Wolfgang Stengel