I have a question about the copy-on-write optimization in PHP that I believe is distinct from this and this question, of which it has been marked as a duplicate. I also do not feel the question is addressed here, either.

In my observation, the copy on write optimization appears to work differently with arrays than with objects. With arrays, it seems that if anything at all changes, no matter how deep in the array, a copy takes place. With objects, it seems that the copy on write only looks at a shallow level.

Consider:

class B{
  var $b;
  function __construct($x) { $this->b = $x; }
}

$w = [new B(0)];
$x = $w; // $x references $w 
$x[0]->b = 1; // now $x does not reference $w 

$y = new B(0);
$z = $y; // $z references $y 
$z->b = 1; // $z still references $y 

I make these confirmations on referencing based on passing $w, $x, $y, and $z into the debug_zval_dump function.

Can someone explain why this is the case? The answer in the second link I provide mentioned "as long as no single byte is changed" and perhaps that is what I need to understand better. My mental model for PHP objects had always been that they are effectively pointers. And the pointer to B in "$x[0]->b = 1" is not changed so I would view that as not a single byte changing.

I also don't see why objects behave differently than arrays in this regard.

Is this behavior properly documented somewhere other than having to read or post on Stack Overflow? The PHP.net manual is quite unhelpful about more theoretical details of the language. With JavaScript, for example, one can view the ECMAScript Standard. That sort of thing doesn't seem to exist for PHP.

Your Common Sense
user137364
  • "now $x does not reference $w" is not true. – Your Common Sense May 17 '22 at 17:37
  • In a nutshell, objects aren't copied on write at all. They are passed by reference. Looks like it's going to be closed as [this dupe](https://stackoverflow.com/questions/185934/how-do-i-create-a-copy-of-an-object-in-php) – Your Common Sense May 17 '22 at 17:38
  • `$x` does not reference the same array as `$w`. But `$x[0]` and `$w[0]` reference the same object. – Barmar May 17 '22 at 17:42
  • @YourCommonSense why do you say "now $x does not reference $w" is false? If you run debug_zval_dump($w) right after "$x=$w", the ref count is 3 (counting the one in the debug function). If you run it right after "$x[0]->b = 1", the ref count is 2 (counting the one in the debug function). How does this not suggest that $x stopped referencing $w? – user137364 May 17 '22 at 17:42
  • ["unhelpful" documentation](https://www.php.net/manual/en/language.oop5.references.php) – Your Common Sense May 17 '22 at 17:43
  • @Barmar: yes, that I get. I'm just not understanding why the behavior differs between objects and arrays. – user137364 May 17 '22 at 17:43
  • Because that's the design of PHP. Objects are intended to be passed around between functions and variables and maintain state, while arrays are supposed to be copied (although you can use reference variables to share them). – Barmar May 17 '22 at 17:45
  • @YourCommonSense: I think the semantics of objects being passed by reference is correct. But it doesn't seem to change the fact that PHP still seems to make $z a reference to $y at first and then $z is only a copy (but indeed this still translates to being a reference, up to the point that $z is assigned to). This question is not a duplicate. The answer you linked to describes a practical question on copying arrays. I care about the underlying theory (which may be less practical in this case, I admit). – user137364 May 17 '22 at 17:48
  • Well, my bad, I overlooked the fact that $w for some reason is not an object. I don't know why you decided to make it an array. Probably to confuse yourself even more. Why are arrays in this question? Why is copy-on-write mentioned regarding objects? – Your Common Sense May 17 '22 at 17:52
  • @Barmar: that makes sense. I guess I'm surprised that PHP is smart enough to see the changes made to an array at such a deep level and make a copy – user137364 May 17 '22 at 17:53
  • It doesn't need to make a copy in the first place, I think it's just a missing optimization. – Barmar May 17 '22 at 17:55
  • @YourCommonSense: my question is more academic and less practical. I just notice that with arrays, the copy on write seems to look very deep within the array and subobjects. With an object, aside from direct assignment, the references seem to remain intact. I was hoping to see the documentation on that – user137364 May 17 '22 at 17:56
  • I think this is considered an implementation detail, not something that has visible effects that need to be documented. – Barmar May 17 '22 at 17:57
  • @Barmar: so you're saying a "smarter" PHP engine or set of language specs would not result in making a copy in the first case? – user137364 May 17 '22 at 17:57
  • Can you please just address my questions and not your theoretical musings? Why are arrays in this question? Where do you see copy on write with objects? – Your Common Sense May 17 '22 at 17:58
  • @YourCommonSense: well, I would think of something along these lines as a copy on write: "$a = new B(0); $b = $a; $b = new B(1);" this does not result in $a changing. But I fully admit my question may have been ill-formed as this may not constitute a "copy on write" in this circumstance. It just seems like originally $b referenced $a and then on assignment $b cannot reference $a anymore so the assignment was effectively done on a copy, not treating $b as a reference. – user137364 May 17 '22 at 18:02
  • @user137364 Yes. There's no need to make a copy of the array when you modify the object that it refers to. I'm surprised that's happening. – Barmar May 17 '22 at 18:03
  • @Barmar: thanks, that clears things up! – user137364 May 17 '22 at 18:06
  • `$b = new B(1);` What "copy" do you see here? Copy of what? Here you just assigned some value to $b. There is never a copy, least of all a "copy on write" – Your Common Sense May 17 '22 at 18:14
  • @YourCommonSense: then I retract what I said about "copy on write" for objects. I shouldn't have used that term. – user137364 May 17 '22 at 18:20
  • So now you can read the manual link provided above, to learn how objects are "copied" – Your Common Sense May 17 '22 at 18:25
  • @YourCommonSense: the manual link you provided does make sense except for when it says that $c and $d are both references. The manual suggests a variable really stores an identifier (in the case of an object) so the notion of reference is less useful. But it proceeds to make $d a reference to $c (okay, so $d is a reference) but also says $c is a reference (and doesn't say it is storing an identifier). Would it be more correct to say: $c stores an identifier and $d is a reference to $c? – user137364 May 17 '22 at 18:40
  • Yes, I think so. I think this article (and its followup) could satisfy your academic interest, https://www.npopov.com/2015/05/05/Internal-value-representation-in-PHP-7-part-1.html – Your Common Sense May 17 '22 at 19:09
  • @YourCommonSense: thank you. I think that's just what I need! – user137364 May 17 '22 at 19:13

1 Answer

You can simply use var_dump() to see object IDs instead of trying to infer object identity from refcounts.

For example:

class B{
  var $b;
  function __construct($x) { $this->b = $x; }
}

$w = [new B(0)];
$x = $w;
$x[0]->b = 1;

var_dump($w, $x);

$y = new B(0);
$z = $y;
$z->b = 1;

var_dump($y, $z);

Output:

array(1) {
  [0]=>
  object(B)#1 (1) {
    ["b"]=>
    int(1)
  }
}
array(1) {
  [0]=>
  object(B)#1 (1) {
    ["b"]=>
    int(1)
  }
}

object(B)#2 (1) {
  ["b"]=>
  int(1)
}
object(B)#2 (1) {
  ["b"]=>
  int(1)
}

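Here we can see that object(B)#1 is preserved through the array assignment: both $w[0] and $x[0] refer to the same object, so the change made through $x[0] is also visible through $w[0].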
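The same point can be checked programmatically instead of by reading dump output. The following is only a minimal sketch: class B is taken from the question, while the `'extra'` element and the `new B(2)` replacement are made up here for illustration. It compares the two array slots with `===` and `spl_object_id()` (available since PHP 7.2), then shows that the arrays themselves are still independent values.

```php
<?php
// Illustrative sketch only. Class B mirrors the question; the 'extra'
// element and the B(2) replacement are hypothetical additions.
class B {
    public $b;
    function __construct($x) { $this->b = $x; }
}

$w = [new B(0)];
$x = $w;          // the array is assigned by value (copied lazily)
$x[0]->b = 1;     // mutates the single object held by both arrays

// Both slots hold a handle to the same instance.
var_dump($w[0] === $x[0]);                               // bool(true)
var_dump(spl_object_id($w[0]) === spl_object_id($x[0])); // bool(true)

// The arrays are separate values: appending to $x leaves $w alone.
$x[] = 'extra';
var_dump(count($w), count($x));                          // int(1) int(2)

// Replacing a slot rebinds it in $x only; $w[0] is still the original object.
$x[0] = new B(2);
var_dump($w[0]->b, $x[0]->b);                            // int(1) int(2)
```

So the array is a value that may be copied, while each element that is an object is just a handle; copying the array copies the handle, not the object it identifies.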

Sammitch