
Given a properly defined variable

$test = New-Object System.Collections.ArrayList

.Add pollutes the pipeline with the count of items in the array, while .AddRange does not. $test.Add('Single') will dump the count to the console. $test.AddRange(@('Single2')) will be clean with no extra effort. Why the different behavior? Is it just an oversight, or is there some intentional behavior I am not understanding?

Given that .AddRange requires coercing to an array when not using a variable (that is already an array) I am tending towards using [void]$variable.Add('String') when I know I need to only add one item, and [void]$test.AddRange($variable) when I am adding an array to an array, even when $variable only contains, or could only contain, a single item. The [void] here isn't required, but I wonder if it's just best practice to have it, depending of course on the answer above. Or am I missing something there too?

Gordon
  • Does this answer your question? [Return Multidimensional Array From Function](https://stackoverflow.com/questions/40220590/return-multidimensional-array-from-function): `$test.Add(,@('Single'))` – iRon May 02 '20 at 07:48
  • "*Add pollutes the **pipeline***", is not entirely correct, you put the items on the pipeline when you output (default: [`Write-Output`](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.utility/write-output)) the object (`$Test`). – iRon May 02 '20 at 07:56
  • I guess my question relates to WHY the difference? Why does `.Add` dump the count to the pipeline and `.AddRange` does not? If I want to know how big the resultant array is after adding one item, wouldn't I also want to know (even more so) after adding N items? And really, when would I ever want to know as pipeline? I can always just get the `.Count`. As for how that relates to the unroll behavior, I don't see the connection, but it's also morning for me and I am still finishing first coffee, so I may look at this again in a bit and see if something starts to make sense. – Gordon May 02 '20 at 08:01
  • And, regarding the pollution, I am never actually outputting `$test`. I expect the contents of `$test` when I do so. But `$variable.Add('String')` is not output, it's simply modifying, and getting back `1` at the console seems like pollution to me. – Gordon May 02 '20 at 08:03
  • And, `$test.Add(,@('Single'))` is actually throwing an error for me. The comma is not only not needed, it's not allowed, at least in PS5 on my Win10VM. Haven't tried it on PS2/Win7. – Gordon May 02 '20 at 08:05
  • `$variable.Add('String')` doesn't have anything to do with the PowerShell pipeline. .Net objects like `$test = New-Object System.Collections.ArrayList` intend to load everything into memory rather than send the items into the pipeline using the pipe ("`|`") operator. – iRon May 02 '20 at 08:21
  • OK, now I am confused, because for me `$variable.Add('String')` prints the current size of the array to the console, which is pipeline behavior as I understand it. `.AddRange()` does not, and it's that difference that I am wondering about. As I understand it one is returning the count, the other isn't, and when you specifically don't handle the return value by assigning it to a variable or sending it to void, then it ends up in the pipeline, which is what I am calling pollution. `$count = $test1.Add('one')` seems to corroborate that, as `$count` will end up containing the return value. – Gordon May 02 '20 at 08:30
  • It does occur to me that what is being returned is NOT the count, since it starts at 0. It's the INDEX of the item that was just added. And I can see where that might be useful to keep track of. For example if I am adding a header log item before an indeterminate number of regular log items, I could hold on to the header's index, send the rest to $null, then when I am done update that header directly because I know the index. Seems a bit edge case to me, but at least it makes some sense. – Gordon May 02 '20 at 09:08
  • this entire discussion can be avoided since the `arraylist` type is deprecated - and has been for several years. [*grin*] so ... don't use it. instead, use something like the `generic.list` type. that does not emit the index of the added item. plus, it is a tiny, tiny bit faster. – Lee_Dailey May 02 '20 at 13:41
  • @Lee_Dailey Really? I feel like it was not that long ago that I was running into some other issues where ArrayList was promoted as the solution. But I guess I'll look into implications of moving to Generic.List now. :) – Gordon May 02 '20 at 15:25
  • @Gordon - yep! it surprised me, too. [*grin*] apparently MS has long deprecated the arraylist type in favor of the generic list type. it's unfortunate that so many examples still use the arraylist ... but it DOES still work. [*grin*] – Lee_Dailey May 02 '20 at 16:13

2 Answers


Why the different behavior? Is it just an oversight, or is there some intentional behavior I am not understanding?

Because many years ago, someone decided that's how ArrayList should behave!

Add() returns the index at which the argument was inserted into the list, which may indeed be useful and makes sense.

With AddRange() on the other hand, it's not immediately clear why it should return anything, and if yes, what? The index of the first item in the input arguments? The last? Or should it return a variable-sized array with all the insert indices? That would be awkward! So whoever implemented ArrayList decided not to return anything at all.
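To see the two return types side by side, here is a minimal sketch that captures what each call hands back. The variable names are illustrative only, and `::new()` assumes PowerShell 5 or later.

```powershell
# Illustrative sketch; variable names are hypothetical.
$test = [System.Collections.ArrayList]::new()

# Add() returns the index at which the item was inserted.
$firstIndex  = $test.Add('Single')   # 0
$secondIndex = $test.Add('Double')   # 1

# AddRange() is declared void, so there is nothing to capture.
$rangeResult = $test.AddRange(@('Three', 'Four'))

"First index : $firstIndex"
"Second index: $secondIndex"
"AddRange returned nothing: $($null -eq $rangeResult)"
"Total items : $($test.Count)"
```

Because the value returned by `Add()` is an index rather than a count, the first call yields 0 even though the list then holds one item.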

In C# or VB.NET, for which ArrayList was initially designed, "polluting the pipeline" doesn't really exist as a concept: if someone invokes .Add() as a statement without assigning the result to a variable, the return value is simply discarded.

The [void] here isn't required, but I wonder if it's just best practice to have it, depending of course on the answer above. Or am I missing something there too?

No, it's completely unnecessary. AddRange() is not magically one day gonna change to output anything.


If you don't ever need to know the insert index, use a [System.Collections.Generic.List[psobject]] instead:

$list = [System.Collections.Generic.List[psobject]]::new()

# this won't return anything, no need for `[void]`
$list.Add(123)

If for some reason you must use an ArrayList, you can "silence" it by overriding the Add() method:

function New-SilentArrayList {
  # Create a new ArrayList
  $newList = [System.Collections.ArrayList]::new()

  # Create a new `Add()` method, then return the list
  $newAdd  = @{
    InputObject = $newList
    MemberType = 'ScriptMethod' 
    Name = 'Add'
    Value = {param($obj) $this.AddRange(@($obj))}
  }
  Write-Output $( 
    Add-Member @newAdd -Force -PassThru
  ) -NoEnumerate
}

Now your ArrayList's Add() will never make a peep again!

PS C:\> $list = New-SilentArrayList
PS C:\> $list.Add(123)
PS C:\> $list
123
Mathias R. Jessen

Apparently I didn't quite understand where you were heading.
"Add pollutes the pipeline" is, on second thought, a correct statement, but .NET methods like $variable.Add('String') do not use the PowerShell pipeline by themselves (until the moment you output the array using the Write-Output command, which is the default command if you do not assign the result to a variable).

The Write-Output cmdlet is typically used in scripts to display strings and other objects on the console. However, because the default behavior is to display the objects at the end of a pipeline, it is generally not necessary to use the cmdlet.

The point is that the Add method of ArrayList returns an [Int32] ("The ArrayList index at which the value has been added"), while AddRange doesn't return anything. This means that if you don't assign the result to something else (which includes $Null = $test.Add('Single')), it will indeed be output to the PowerShell pipeline.
Instead, you might also consider using the Add method of the List class, which also doesn't return anything; see also: ArrayList vs List<> in C#.
But in general, I recommend using native PowerShell commands that do use the pipeline.
(I can't give you a good example, as it is not clear what output you expect, but I noticed another question you removed, and from that question I presume that the answer to "Why should I avoid using the increase assignment operator (+=) to create a collection" might help you further.)
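As a rough illustration of the point about unassigned return values, the sketch below compares a function that lets the indices from Add() escape with one that discards them. The function names `Get-NoisyList` and `Get-QuietList` are hypothetical and exist only for this example.

```powershell
# Hypothetical helper: the indices returned by Add() are not captured,
# so they become part of the function's output.
function Get-NoisyList {
    $list = [System.Collections.ArrayList]::new()
    $list.Add('a')
    $list.Add('b')
    $list
}

# Hypothetical helper: the same logic with the return values discarded.
function Get-QuietList {
    $list = [System.Collections.ArrayList]::new()
    [void]$list.Add('a')      # cast the result to void
    $null = $list.Add('b')    # or assign it to $null
    $list
}

$noisy = @(Get-NoisyList)     # 0, 1, 'a', 'b'
$quiet = @(Get-QuietList)     # 'a', 'b'

"Noisy output has $($noisy.Count) items; quiet output has $($quiet.Count)."
```

Either `[void]` or `$null =` works here; which one to use is largely a matter of taste.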

iRon
  • I guess what I am still trying to understand is, WHY does `.Add()` return an `[int32]` and `.AddRange()` returns nothing? What is the underlying logic for different return behaviors in such similar methods? As for the pipeline, I find it is VERY useful for one liners and simple scripts, but it can be a real pain to debug when you are dealing with large programs, and what does or doesn't end up in the pipeline is so inconsistent. It's one reason I am refactoring everything to classes now, to avoid the pipeline entirely and get a more deterministic behavior out of PowerShell. – Gordon May 02 '20 at 11:12
  • Regarding that other question, I messed up my test code. I was trying to repeat to measure with `foreach ($I in 1.1000) {` rather than `foreach ($I in 1..1000) {` and it looked like `+=` was faster. Once I revised the code I saw what I expected to see, += is slower, and the more iterations the slower it got. – Gordon May 02 '20 at 11:14