5

I found out today that an ArrayList I passed to a function gets changed when I remove a value from it within the function. The code below seems to imply that the argument is passed by reference. Why would that be? Is this by design or some kind of bug? (I am using PowerShell v4 on Windows 8.1.)

function myfunction {
    param (
        [System.Collections.ArrayList]$local
    )
        "`$local: " + $local.count
        "removing 1 from `$local"
        $local.RemoveAt(0)     
        "`$local:" + $local.count       
}

[System.Collections.ArrayList]$names=(Get-Content c:\temp\names.txt)

"`$names: " + $names.count
 myfunction -local $names      
"`$names: " + $names.count

RESULT:

$names: 16
$local: 16
removing 1 from $local
$local:15
$names: 15
Adil Hindistan

3 Answers

8

This is by design, and is not a bug. Arrays, collections and hash tables are passed by ref. This behaves differently from adding to or removing from an array because that operation creates a new array inside the function scope. Any time you create a new variable inside the function, it is scoped to the function. $local.RemoveAt(0) doesn't create a new $local, it just calls a method on the existing $local in the parent script. If you want the function to operate on its own $local, you need to explicitly create a new one inside the function.

Because it's by ref, this won't work:

 $local = $local

You'll still be referencing $local in the parent scope. But you can use the .Clone() method to create a new copy of it:

function testlocal {
    param ([collections.arraylist]$local)
    # Work on a copy so the caller's list is left untouched
    $local = $local.Clone()
    $local.RemoveAt(0)
    $local
}

$local = [collections.arraylist](1,2,3)

'Testing function arraylist'    
testlocal $local
''
'Testing local arraylist'
$local


Testing function arraylist
2
3

Testing local arraylist
1
2
3
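
One caveat worth keeping in mind: ArrayList.Clone() makes a shallow copy. The sketch below uses hypothetical sample data (hashtables as list elements, which are not part of the original question) to show that removing items from the clone leaves the original list alone, while the element objects themselves are still shared.

```powershell
# Hypothetical data: an ArrayList whose elements are hashtables (reference types).
$original = [System.Collections.ArrayList] @(@{ Name = 'a' }, @{ Name = 'b' })
$copy = $original.Clone()

# Structural changes to the copy do not affect the original list...
$copy.RemoveAt(1)
"original count: $($original.Count)"        # 2
"copy count:     $($copy.Count)"            # 1

# ...but both lists still point to the same element objects.
$copy[0].Name = 'changed'
"original[0].Name: $($original[0].Name)"    # changed
```

For lists of strings or numbers, as in the question, the shallow copy is all that is needed.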
BenMorel
mjolinor
  • I see what's happening. Especially b/c of internal array manipulation, I was under the wrong impression that things are passed by value but just tested the code with [hashtable] and sure enough that is the case. – Adil Hindistan Jan 25 '14 at 18:41
  • #^$*!!!!! After looking at PS scope articles on Microsoft websites for the creation of a complicated nesting script, *many* hours later finally came to the conclusion that not all "variables" were passed into functions by value. (at least) http://technet.microsoft.com/en-us/library/hh847849.aspx doesn't mention any special rules for arrays, collections and hash tables, not even in the "basic rules" section! ... and http://blogs.msdn.com/b/powershell/archive/2007/04/14/controlling-the-scope-of-variables.aspx doesn't make a single mention of arrays, collections, or hash tables. VERY ANNOYING MS! – user66001 Jan 04 '16 at 04:18
  • 1
    ++ for the `.Clone()` tip, but your explanation is not quite correct: shadowing of variables from the parent (ancestral) scope does not come into play in the way you state: A parameter declaration is an implicit _local_ variable declaration. That is, on _entering_ `testlocal()` `$local` is already a local variable containing whatever was passed as the parameter - it never sees an ancestral variable of the same name; if a reference to an instance of a _reference_ type was passed (as in this case), then `$local` receives a copy of the _reference_, and thus still points to the original instance. – mklement0 Jan 23 '16 at 05:27
  • To underscore my previous point: try `function foo([string] $local) { "\`$local inside foo: $local" }; $local = 'hi'; foo; foo bar; $local` – mklement0 Jan 23 '16 at 23:40
  • You seem determined to beat this to death. The apparent confusion about inheritance is because the OP has used the same variable name ($local) in both the parent scope and the function scope, and then passed the reference from one to the other. – mjolinor Jan 24 '16 at 00:32
  • I see two distinct variables in the OP's question: `$names` in the calling scope, and `$local` inside the function (corresponding to parameter `-local`). That said, my point was that even with the _same_ name inheritance does _not_ come into play, as explained in my first comment, and as demonstrated in my second one. I invite you to look at my answer, and perhaps we can arrive at a shared understanding (which is not the same as beating something to death). – mklement0 Jan 24 '16 at 17:14
  • @user66001: The problem at hand is not related to _scoping_, so it's not surprising that the linked articles offer no help. Instead, the problem relates to _by-value / by-reference parameter passing logic in .NET in general_ (PowerShell behaves no differently in that respect). In short: a variable containing an instance of a _reference_ type - such as `[Collections.ArrayList]` - is effectively passed _by reference_. – mklement0 Jan 24 '16 at 22:24
  • 1
    Okay, I misread the question, so the references to $local in the parent scope aren't really relevant to this particular problem. The main points that array lists are indeed passed by reference, and the means to get a new copy to use in the function scope is the .clone() method remain. – mjolinor Jan 24 '16 at 22:33
  • @mklement0 - Which problem at hand? The OP's, or mine? Regardless, while I don't proclaim to be an expert on scoping / by-ref/by-value passing, it would seem to be that by-ref variable could never traverse into a child scope "fully" unless it was actually doing by-value. – user66001 Jan 25 '16 at 05:00
  • Agreed, the `.Clone()` part is the relevant part of your answer, but the rest - even though it explains a subtlety of PowerShell use that's important to understand in general - simply doesn't apply here, and is likely to cause confusion for future readers. Can you please update your answer accordingly? – mklement0 Jan 25 '16 at 13:49
  • @user66001: By problem at hand I meant the OP's problem. I don't understand your comment, but I invite you to read my answer and, if you disagree, you can tell me there. – mklement0 Jan 25 '16 at 13:52
7

mjolinor's helpful answer provides the crucial pointer: To have the function operate on a copy of the input ArrayList, it must be cloned via .Clone() first.

Unfortunately, the explanation offered there for why this is required is not correct:[1]

No PowerShell-specific variable behavior comes into play; the behavior is fundamental to the .NET framework itself, which underlies PowerShell:

Variables are technically passed by value (by default[2]), but what that means depends on the variable value's type:

  • For value types, for which variables contain the data directly, a copy of the actual data is made.
  • For reference types, for which variables only contain a reference to the data, a copy of the reference is made, resulting in effective by-reference passing.

Therefore, in the case at hand, because [System.Collections.ArrayList] is a reference type (verify with -not [System.Collections.ArrayList].IsValueType), parameter $local by design points to the very same ArrayList instance as variable $names in the calling scope.
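
To make the distinction concrete, here is a minimal sketch. The function names Set-ToZero and Remove-First are hypothetical helpers invented for illustration; they contrast a value-type parameter with a reference-type parameter.

```powershell
# Hypothetical helpers, for illustration only.
function Set-ToZero {
    param([int] $n)
    $n = 0                 # changes only the function's own copy of the value
}

function Remove-First {
    param([System.Collections.ArrayList] $list)
    $list.RemoveAt(0)      # mutates the instance the caller also points to
}

$number = 42
Set-ToZero $number
$number                    # still 42

$items = [System.Collections.ArrayList] @(1, 2, 3)
Remove-First $items
$items.Count               # 2

[int].IsValueType                             # True
[System.Collections.ArrayList].IsValueType    # False
```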

Unfortunately, PowerShell can obscure what's happening by cloning objects behind the scenes with certain operations:

  • Using += to append to an array ([System.Object[]]):

     $a = 1, 2, 3  # creates an instance of reference type [Object[]]
     $b = $a       # $b and $a now point to the SAME array
     $a += 4       # creates a NEW instance; $a now points to a DIFFERENT array.
    
  • Using += to append to a [System.Collections.ArrayList] instance:

    • While in the case of an array ([System.Object[]]) a new instance must be created, because arrays are by definition of fixed size, PowerShell unfortunately quietly converts a [System.Collections.ArrayList] instance to an array when using += and therefore also creates a new object, even though a [System.Collections.ArrayList] can be grown in place, namely with the .Add() method.

      $al = [Collections.ArrayList] @(1, 2, 3)  # creates an ArrayList
      $b = $al       # $b and $al now point to the SAME ArrayList
      $al += 4       # !! creates a NEW object of type [Object[]]
      # By contrast, this would NOT happen with: $al.Add(4)
      
  • Destructuring an array:

     $a = 1, 2, 3     # creates an instance of reference type [Object[]]
     $first, $a = $a  # creates a NEW instance
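
The identity changes described in the list above can be checked with [object]::ReferenceEquals(). This is a minimal sketch that reuses the variable names from the snippets; note that $al is deliberately not type-constrained, as in the example above.

```powershell
$al = [System.Collections.ArrayList] @(1, 2, 3)
$b  = $al
[object]::ReferenceEquals($al, $b)    # True: both variables point to the same instance

$null = $al.Add(4)                    # grows the list in place; Add() returns an index, discarded here
[object]::ReferenceEquals($al, $b)    # True: still the same instance
$b.Count                              # 4

$al += 5                              # replaces the content of $al with a new [Object[]]
[object]::ReferenceEquals($al, $b)    # False
$al.GetType().Name                    # Object[]
$b.Count                              # 4: the original ArrayList was not touched by +=
```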
    

[1] mjolinor's misconception is around inheriting / shadowing of variables from the parent (ancestral) scope: A parameter declaration is implicitly a local variable declaration. That is, on entering testlocal() $local is already a local variable containing whatever was passed as the parameter - it never sees an ancestral variable of the same name. The following snippet demonstrates this: function foo([string] $local) { "`$local inside foo: $local" }; $local = 'hi'; "`$local in calling scope: $local"; foo; foo 'bar' - foo() never sees the calling scope's definition of $local.

[2] Note that some .NET languages (e.g., ref in C#) and even PowerShell itself ([ref]) also allow passing a variable by reference, so that the local parameter is effectively just an alias for the calling scope's variable, but this feature is unrelated to the value/reference-type dichotomy.
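
As a rough illustration of footnote [2], the sketch below contrasts an ordinary parameter with a [ref] parameter. The function names Reset-ListByValue and Reset-ListByRef are hypothetical and exist only for this example.

```powershell
# Hypothetical functions, for illustration only.
function Reset-ListByValue {
    param($target)
    # Rebinds only the local parameter variable; the caller's variable is unaffected.
    $target = [System.Collections.ArrayList] @('replaced')
}

function Reset-ListByRef {
    param([ref] $target)
    # Assigning to .Value replaces the content of the CALLER's variable.
    $target.Value = [System.Collections.ArrayList] @('replaced')
}

$names = [System.Collections.ArrayList] @('a', 'b', 'c')

Reset-ListByValue $names
$names.Count                      # 3

Reset-ListByRef ([ref] $names)
$names.Count                      # 1
$names[0]                         # replaced
```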

mklement0
-1

If you call myfunction -local $names, then $local and $names refer to the same list, and your function deletes the first item of $local (affecting $names, remember). The next time you read the variable it has been modified, so one fix is to read $names from the file a second time, which returns the count to 16.

[System.Collections.ArrayList]$names=(Get-Content c:\temp\names.txt)

"`$names: " + $names.count
 myfunction -local $names      
# $names has been altered now, so read the file again.
[System.Collections.ArrayList]$names=(Get-Content c:\temp\names.txt)
"`$names: " + $names.count

Or rewrite your function to skip the first item in the pipeline instead of deleting it on the fly:

Select-Object -Skip 1
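
A minimal sketch of that idea follows, assuming a hypothetical function name (Get-AllButFirst) and stand-in sample names in place of the contents of c:\temp\names.txt. The function emits everything after the first item and never modifies the list it was given.

```powershell
# Hypothetical function; returns the remaining items without touching the input list.
function Get-AllButFirst {
    param([System.Collections.ArrayList] $local)
    $local | Select-Object -Skip 1
}

# Stand-in data instead of reading the file.
$names = [System.Collections.ArrayList] @('Ann', 'Bob', 'Cid')
$rest  = @(Get-AllButFirst -local $names)

$names.Count    # 3: the caller's list is unchanged
$rest.Count     # 2
```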

Here is some further proof. Note that the Write-Host commands show both variables being changed at the same time by the function.

function myfunction {
    param (
        [System.Collections.ArrayList]$local
    )
    Write-Host $names
    Write-Host $local
    "`$local: " + $local.count
    "removing 1 from `$local"
    $local.RemoveAt(0)
    "`$local: " + $local.count
    Write-Host $names
    Write-Host $local
}

[System.Collections.ArrayList]$names=(Get-Content c:\temp\names.txt)

"`$names: " + $names.count
 myfunction -local $names  
"`$names: minus 1 - bork bork bork " + $names.count
[System.Collections.ArrayList]$names=(Get-Content c:\temp\names.txt)    
"`$names: " + $names.count
Knuckle-Dragger
  • nifty little array article http://powershell.com/cs/blogs/tips/archive/2008/12/04/manipulating-arrays-effectively.aspx – Knuckle-Dragger Jan 25 '14 at 04:52
  • Not sure I understand what you are saying. So why is the change I made within the function causing the number of items in my original array to go down? – Adil Hindistan Jan 25 '14 at 05:10
  • Because your function deletes it. I guess what you are not getting is that when you delete the item from $local.RemoveAt(0), you are actually deleting it from $names at the same time. I'm gonna assume this is by design, if you need a why. I'll example you some proof in a second. – Knuckle-Dragger Jan 25 '14 at 06:00
  • Well, I know that's what's happening and the code to show exactly that. I am asking why it is happening. Why is local scope affecting script scope as if I passed it by ref. It would not happen if this was a regular array or hash. – Adil Hindistan Jan 25 '14 at 15:03
  • Having just come across this again, @AdilHindistan: Actually, it _would_ happen with regular arrays and hash tables, as the following commands demonstrate: `$arr = 1, 2, 3; & { param($arrParam) $arrParam[0] = 'new' } $arr; $arr` and `$hash = @{ foo = 1 }; & { param($hashParam) $hashParam['foo'] = 'new' } $hash; $hash`. in short: if the variable value is an instance of a _reference type_ other than `[string]`, the callee gets an object _reference_, which they can potentially modify. The currently accepted answer not only doesn't tell this story, it contains outright misinformation. – mklement0 Feb 06 '20 at 20:49
  • 1
    @mklement0 funny, I never saw your answer before but I understand what you are saying and changed the answer. Thank you! – Adil Hindistan Feb 12 '20 at 21:37