8

I knew that it can be dangerous to iterate over items by reference in foreach.

In particular, one must not reuse the variable that was passed by reference, because it affects the $array, like in this example:

$array = ['test'];
foreach ($array as &$item){
    $item = $item;
}
$item = 'modified';
var_dump($array);

array(1) { [0]=> &string(8) "modified" }

Now this bit me: the content of the array gets modified inside the function should_not_modify, even though I pass $array by value and not by reference.

function should_not_modify($array){
    foreach($array as &$item){
        $item = 'modified';
    }
}
$array = ['test'];
foreach ($array as &$item){
    $item = (string)$item;
}
should_not_modify($array);
var_dump($array);

array(1) { [0]=> &string(8) "modified" }

I'm tempted to go through my whole codebase and insert unset($item); after each foreach($array as &$item).

But since this is a big task and introduces a potentially useless line, I would like to know if there is a simple rule for when foreach($array as &$item) is safe without an unset($item); after it, and when it is not.

Edit for clarification

I think I understand what happens and why. I also know the best way to guard against it: foreach($array as &$item){...};unset($item);

I know that this is dangerous after foreach($array as &$item):

  • reuse the variable $item
  • pass the array to a function

My question is: are there other cases that are dangerous, and can we build an exhaustive list of what is dangerous? Or the other way round: is it possible to describe when it is not dangerous?

Lorenz Meyer
  • "Dangerous" is subjective. Sometimes it's useful or necessary. – tadman Jan 09 '18 at 17:23
  • 1
    @tadman Yes, like a knife: It is often useful and necessary, but it is dangerous. – Lorenz Meyer Jan 09 '18 at 17:24
  • @poke You are wrong, https://stackoverflow.com/questions/2030906/are-arrays-in-php-passed-by-value-or-by-reference, arrays are passed by value. In my case, the array is passed by value, but the `$item` inside remains a reference after `foreach`. – Lorenz Meyer Jan 09 '18 at 17:28
  • The array is passed by value. Look here - https://eval.in/932720 Your problem is with the assignment to $item - when $item is a reference, you store not a value but a reference. Pay attention to the & in var_dump – splash58 Jan 09 '18 at 17:30
  • It's dangerous when you modify the array structure in the foreach, It can do "spooky things" – ArtisticPhoenix Jan 09 '18 at 17:32
  • 1
    Ok, now I'm really confused. This isn't how I thought PHP worked at all, and it's not actually anything to do with looping by reference. It's an issue with *any* reference to the array existing outside of the function: see https://eval.in/932728 – iainn Jan 09 '18 at 17:36
  • @LorenzMeyer *In my case, the array is passed by value, but the $item inside remains a reference after foreach* That is not so - change item in the function to any other name – splash58 Jan 09 '18 at 17:36
  • @LorenzMeyer If your programming language has no sharp edges it's probably useless. A butter knife might be safe, but good luck cutting down a tree with it. PHP doesn't really have a good way of communicating the intent of a function like C++ does with explicit references. You'll need to convey the intent in either the name of the function, a comment, or some kind of variable naming convention. – tadman Jan 09 '18 at 17:36
  • @LorenzMeyer https://eval.in/932753 - this line `$item =& $array[0];` makes $array[0] as reference!!! I don't understand that !!! – splash58 Jan 09 '18 at 18:05
  • @splash58 Isn't this what the operator `=&` is supposed to do ? – Lorenz Meyer Jan 09 '18 at 18:45
  • I think I was wrong. `In the first of these, PHP references allow you to make two variables refer to the same content. Meaning, when you do: $a =& $b; it means that $a and $b point to the same content.` - from http://php.net/manual/en/language.references.whatdo.php. So both variables are set as references. But I never thought about it – splash58 Jan 09 '18 at 18:49

3 Answers

6

About foreach

First of all, some (maybe obvious) clarifications about two behaviors of PHP:

  1. foreach($array as $item) will leave the variable $item untouched after the loop. If the variable is a reference, as in foreach($array as &$item), it will "point" to the last element of the array even after the loop.

  2. When a variable is a reference, an assignment such as $item = 'foo'; changes whatever the reference points to, not the variable ($item) itself. This is also true for a subsequent foreach($array2 as $item), which treats $item as a reference if it was created as one and therefore modifies whatever the reference points to (in this case, the last element of the array used in the previous foreach).

Obviously this is very error prone and that is why you should always unset the reference used in a foreach to ensure following writes do not modify the last element (as in example #10 of the doc for the type array).
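The two behaviors above can be sketched in a few lines. This is only an illustration; the variable names ($other, $fixed, $entry) are made up and do not come from the question:

```php
<?php
// A leftover reference from one loop is written to by the next loop.
$array = ['a', 'b', 'c'];
foreach ($array as &$item) {
    // iterating by reference is enough; no body needed
}
// $item is still a reference to $array[2] here.

$other = ['x', 'y'];
foreach ($other as $item) {
    // each iteration assigns to $item, i.e. writes into $array[2]
}
echo implode(',', $array), "\n"; // prints "a,b,y"

// Same sequence, but the reference is broken right after the first loop.
$fixed = ['a', 'b', 'c'];
foreach ($fixed as &$entry) {
}
unset($entry); // $entry no longer refers to $fixed[2]
foreach ($other as $entry) {
}
echo implode(',', $fixed), "\n"; // prints "a,b,c"
```

The only difference between the two halves is the unset() call, which is why it is usually placed on the line directly after the loop.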

About the function that modifies the array

It's worth noting that - as pointed out in a comment by @iainn - the behavior in your example has nothing to do with foreach. The mere existence of a reference to an element of the array will allow this element to be modified. Example:

function should_not_modify($array){
    $array[0] = 'modified';
    $array[1] = 'modified2';
}
$array = ['test', 'test2'];
$item = & $array[0];

should_not_modify($array);
var_dump($array);

Will output:

array(2) {
  [0] =>
  string(8) "modified"
  [1] =>
  string(5) "test2"
}

This is admittedly very surprising, but it is explained in the PHP documentation, "What References Do":

Note, however, that references inside arrays are potentially dangerous. Doing a normal (not by reference) assignment with a reference on the right side does not turn the left side into a reference, but references inside arrays are preserved in these normal assignments. This also applies to function calls where the array is passed by value. [...] In other words, the reference behavior of arrays is defined in an element-by-element basis; the reference behavior of individual elements is dissociated from the reference status of the array container.

With the following example (copy/pasted):

/* Assignment of array variables */
$arr = array(1);
$a =& $arr[0]; //$a and $arr[0] are in the same reference set
$arr2 = $arr; //not an assignment-by-reference!
$arr2[0]++;
/* $a == 2, $arr == array(2) */
/* The contents of $arr are changed even though it's not a reference! */

It's important to understand that when you create a reference, for example $a = &$b, both $a and $b are on equal footing. $a is not pointing to $b or vice versa. $a and $b are pointing to the same place.

So when you do $item = & $array[0]; you actually make $array[0] point to the same place as $item. Since $item is a global variable, and references inside arrays are preserved, modifying $array[0] from anywhere (even from within the function) modifies it globally.
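As a rough sketch of why unset() helps here: once the extra name is gone, the array element is the only member left in its reference set, and on PHP 7 and later a by-value copy of the array no longer shares that slot. The function name overwrite_first and the variable names are hypothetical, chosen only for this illustration:

```php
<?php
// Hypothetical helper: writes to its own by-value copy of the array.
function overwrite_first(array $array): void
{
    $array[0] = 'modified';
}

// Case 1: a second name still refers to the element.
$shared = ['test'];
$item = &$shared[0];        // $shared[0] and $item form one reference set
overwrite_first($shared);
echo $shared[0], "\n";      // prints "modified": the copy shared the slot

// Case 2: the second name is removed before the call.
$isolated = ['test'];
$ref = &$isolated[0];
unset($ref);                // only $isolated[0] is left in the reference set
overwrite_first($isolated);
echo $isolated[0], "\n";    // prints "test": the copy got its own value
```

So the danger lasts exactly as long as another name for the element is alive.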

Conclusion

Are there other cases that are dangerous, and can we build an exhaustive list of what is dangerous. Or the other way round: is it possible to describe when it is not dangerous.

I'm going to repeat the quote from the PHP doc again: "references inside arrays are potentially dangerous".

So no, it's not possible to describe when it is not dangerous, because it is never not dangerous. It's too easy to forget that $item has been created as a reference (or that a global reference has been created and not destroyed), reuse it elsewhere in your code, and corrupt the array. This has long been a topic of debate (in this bug for example), and people call it either a bug or a feature...
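If the loop only needs to rewrite elements, one way to sidestep the problem is to avoid creating a reference at all. The sketch below is a suggestion rather than something taken from the question, and $prices and $labels are made-up names:

```php
<?php
$prices = ['1', '2', '3'];

// Option 1: write back through the key; $value is a plain copy.
foreach ($prices as $key => $value) {
    $prices[$key] = (int) $value * 2;
}
echo implode(',', $prices), "\n"; // prints "2,4,6"

// Option 2: build a new array instead of editing in place.
$labels = array_map(function ($price) {
    return "item-$price";
}, $prices);
echo implode(',', $labels), "\n"; // prints "item-2,item-4,item-6"
```

Neither loop leaves a reference behind, so there is nothing to unset afterwards.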

rlanvin
0

The accepted answer is the best, but I'd like to add a complement: when is unset($item); not necessary after a foreach($array as &$item)?

  • $item: if it is never reused after, it cannot harm.

  • $array: the last element is a reference. This is always dangerous, for all the reasons already stated.

So what changes that element from being a reference back to a value?

  • the most cited: unset($item);

  • $item falling out of scope: when the array is returned from a function, the local $item is destroyed, and the array becomes 'normal' after being returned from the function.

    function test(){
        $array = [1];
        foreach($array as &$item){
            $item = $item;
        }
        var_dump($array);
        return $array;
    }
    $a = test();
    var_dump($a);
    

    array(1) { [0]=> &int(1) }
    array(1) { [0]=> int(1) }

    But beware: if you do anything else before returning, it can bite !

Lorenz Meyer
-2

You can break the reference with a JSON encode/decode round trip:

function should_not_modify($array){
    $array = json_decode(json_encode($array), true); // true: decode to arrays, not stdClass objects
    foreach($array as &$item){
        $item = 'modified';
    }
}
$array = ['test'];
foreach ($array as &$item){
    $item = (string)$item;
}
should_not_modify($array);
var_dump($array);

The question is purely academic, and this is a bit of a hack. But, it's sort of fun, in a stupid programming way.

And of course it outputs:

array(1) {
  [0]=>string(4) "test"
}

As an aside, the same trick works in JavaScript, which can also give you some wonkiness from references.

I wish I had a good example, because I've had some "weird" stuff happen, I mean like some quantum entanglement stuff. This one time at a PHP camp, I had a recursive function ( pass by reference ) with a foreach ( pass by reference ) and well it sort of ripped a hole in the space time continuum.

halfer
ArtisticPhoenix
  • 3
    Using `unset($item);` after the foreach loop is much better than `json_decode(json_encode())`, both from a performance perspective and for code readability. Also, modifying the function is wrong, because the function should remain generic. – Lorenz Meyer Jan 09 '18 at 17:55
  • 2
    The question is *why* there is a reference within the function, not how to "break" it. – iainn Jan 09 '18 at 17:59
  • @iainn - I fully understand that. But thanks for mentioning it. – ArtisticPhoenix Jan 09 '18 at 18:06