1

I am new to PowerShell and there is a weird behavior I cannot explain. I call a function that returns a [System.Collections.ArrayList], but when I print the variable that receives the function's output, an extra number shows up. With one value (for example, logXXX_20210222_075234355.txt), I get 0 logXXX_20210222_075234355.txt. The value 0 gets added for some reason, as if it were the index of the value. With 4 values, it looks like this:

0 1 2 3 logXXX_20210222_075234315.txt logXXX_20210225_090407364.txt logXXX_20210204_120318221.txt logXXX_20210129_122737751.txt

Can anyone help?

Here is a simple code that does that:

function returnAnArray{
   $arrayToReturn =[System.Collections.ArrayList]::new()
   $arrayToReturn.Add('logICM_20210222_075234315.txt')
   return $arrayToReturn
}

$fileNames = returnAnArray
Write-Host $fileNames
0 logICM_20210222_075234315.txt
mklement0
  • What is the purpose of this function? It appears you could probably replace it with `@()` – Mathias R. Jessen Mar 04 '21 at 12:50
  • ArrayList is obsolete: https://github.com/dotnet/platform-compat/blob/master/docs/DE0006.md – Alistair Wall Mar 04 '21 at 13:09
  • In short: any output - be it from a command or a .NET method call - that is neither captured nor redirected (sent through the pipeline or to a file) is _implicitly output_. To simply _discard_ such output, it's best to use `$null = ...`; see [this answer](https://stackoverflow.com/a/55665963/45375). If you don't discard such output, it becomes part of a function's "return value" (stream of output objects), for instance. – mklement0 Mar 04 '21 at 14:13
  • You don't need to explicitly define an array in the function. PowerShell's implicit output behaviour creates an array automatically, when you output more than one value. See https://stackoverflow.com/a/55665963/7571258 – zett42 Mar 04 '21 at 14:17
  • Note that even with the accidental-output problem corrected, your function doesn't output an _array list_ as such: It outputs the _elements_ of that array list, because PowerShell's pipeline automatically _enumerates_ collections; to output a collection _as a whole_, use `return , $arrayToReturn` (wrap it in an aux. single-element array). Without that, if the array list happens to have just _one_ element, that one element is output as-is, _not_ wrapped in an array. See [this answer](https://stackoverflow.com/a/62489498/45375) for more information. – mklement0 Mar 04 '21 at 15:51

2 Answers

7

The ArrayList class's .Add(...) method returns the index at which the new element was added. PowerShell emits all uncaptured output from a function, so those index numbers get intermingled with the intended output.

My favorite solution is to simply cast the output of the .Add(...) method to [Void]:

function returnAnArray{
    $arrayToReturn = [System.Collections.ArrayList]::new()
    [Void]$arrayToReturn.Add('logICM_20210222_075234315.txt')
    return $arrayToReturn
}

You can also pipe the call to Out-Null for this purpose, but in many cases it doesn't perform as well.
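
For comparison, here is a minimal sketch of the Out-Null variant, reusing the function from the question. It only illustrates the syntax and makes no claim about how it performs against the other approaches.

```powershell
function returnAnArray {
    $arrayToReturn = [System.Collections.ArrayList]::new()
    # Piping to Out-Null discards the index that .Add() returns,
    # so it never reaches the function's output stream.
    $arrayToReturn.Add('logICM_20210222_075234315.txt') | Out-Null
    return $arrayToReturn
}

$fileNames = returnAnArray
Write-Host $fileNames   # prints only the file name, without a leading 0
```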

Another method is to assign it to $null like:

function returnAnArray{
    $arrayToReturn = [System.Collections.ArrayList]::new()
    $null = $arrayToReturn.Add('logICM_20210222_075234315.txt')
    return $arrayToReturn
}

In some cases this can be marginally faster. However, I prefer the [Void] syntax and haven't observed whatever minor performance differential there may be.

Note: $null = ... works in all cases, while there are some cases where [Void] will not; see this answer (thanks again mklement0) for more information.

As an aside, you can also create the list by casting:

$arrayToReturn = [System.Collections.ArrayList]@()

Update Incorporating Important Comments from @mklement0:

return $arrayToReturn may not behave as intended. PowerShell's output behavior is to enumerate (stream) collections down the pipeline. As a result, a 1-element ArrayList ends up returning a scalar, and a multi-element ArrayList returns a typical object array ([Object[]]), not [Collections.ArrayList] as seems to be the intention.

The comma operator can be used to guarantee the return type by making the ArrayList the first element of another array. See this answer for more information.

Example without ,:

Function Return-ArrayList { [Collections.ArrayList]@(1,2,3,4,5,6) }
$ArrReturn = Return-ArrayList
$ArrReturn.gettype().FullName

Returns: System.Object[]

Example with ,:

Function Return-ArrayList { , [Collections.ArrayList]@(1,2,3,4,5,6) }
$ArrReturn = Return-ArrayList
$ArrReturn.gettype().FullName

Returns: System.Collections.ArrayList

Of course, this can also be handled by the calling code, most commonly by wrapping the call in an array subexpression @(...). A call like $filenames = @(returnAnArray) will force $filenames to be a typical object array ([Object[]]). Casting like $filenames = [Collections.ArrayList]@(returnAnArray) will make it an ArrayList.
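
A small sketch of those caller-side options, assuming the function from the question with the .Add() output suppressed. The variable names $plain, $array and $list are just illustrative.

```powershell
function returnAnArray {
    $arrayToReturn = [System.Collections.ArrayList]::new()
    $null = $arrayToReturn.Add('logICM_20210222_075234315.txt')
    return $arrayToReturn   # enumerated on output, so the caller receives the element itself
}

$plain = returnAnArray                                    # single element arrives as a scalar string
$array = @(returnAnArray)                                 # forced into an object array
$list  = [System.Collections.ArrayList]@(returnAnArray)   # cast back to an ArrayList by the caller

$plain.GetType().FullName   # System.String
$array.GetType().FullName   # System.Object[]
$list.GetType().FullName    # System.Collections.ArrayList
```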

For the latter approach, I always question if it's really needed. The typical use case for an ArrayList is to work around the poor performance associated with using += to append to arrays. Often this can be accomplished by allowing PowerShell to return the array for you (see below). But, even if you're forced to use it inside the function, it doesn't mean you need it elsewhere in the code.

For Example:

$array = 1..10 | ForEach-Object{ $_ }

Is preferred over:

$array = [Collections.ArrayList]@()
1..10 | ForEach-Object{ [Void]$array.Add( $_ ) }

Persisting the ArrayList type beyond the function and through to the caller should be based on a persistent need, for example, if there's a need to easily add/remove elements further along in the program.

Still More Information:

Notice the Return statement isn't needed either. This very much ties back to why you were getting extra output. Anything a function outputs is returned to the caller. Return isn't explicitly needed for this case. More commonly, Return can be used to exit a function at desired points...

A function like:

Function Demo-Return {
    1
    return 
    2
}

This will return 1 but not 2 because Return exited the function beforehand. However, if the function were:

Function Demo-Return
{
    1
    return 2
}

This returns 1, 2.

However, that's equivalent to Return 1,2 or just 1,2 without Return.

Update based on comments from @zett42:

You could avoid the ArrayList behavior altogether by using a different collection type, most commonly a generic list, [Collections.Generic.List[object]]. Microsoft's guidance (DE0006) already discourages [ArrayList] in new code, making generic lists a better option. Furthermore, a generic list's .Add() method doesn't output anything, so you do not need [Void] or any other nullification method. Generic lists are slightly faster than ArrayLists, and skipping the nullification operation is a further, albeit still small, performance advantage.
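
As a rough sketch of that approach, the function below swaps in a generic list. The name returnAList is hypothetical, and the unary comma is only there because this sketch assumes the caller wants the list itself rather than its elements.

```powershell
function returnAList {
    $list = [System.Collections.Generic.List[object]]::new()
    # List[T].Add() returns nothing, so there is no stray output to suppress.
    $list.Add('logICM_20210222_075234315.txt')
    # The unary comma wraps the list so the pipeline does not enumerate it.
    return , $list
}

$fileNames = returnAList
$fileNames.Count            # 1
$fileNames.Add('another.txt')   # still a list, so elements can be added later
```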

Steven
-1

ArrayList appears to store alternating indexes and values:

PS /home/alistair> $filenames[0]
0
PS /home/alistair> $filenames[1]
logICM_20210222_075234315.txt
Alistair Wall
  • The ArrayList isn't storing the indexes; the `$FileNames` variable is. PowerShell coalesces all function output, so the whole output of the function gets stored in `$FileNames`, not just the ArrayList. – Steven Mar 04 '21 at 13:27
  • `$fileNames` isn't an _array list_ anymore, because the function outputs _its elements_, which, if there's more than _one_ element, PowerShell collects in an _array_ (`[object[]]`) on assigning to a variable. The index comes from the function's output stream accidentally getting "polluted" by the return value from the `.Add()` method call. – mklement0 Mar 05 '21 at 18:07