The following code prints the item count for each string that contains more than two items, but it calls .Split(',') twice.

'a,b,c', 'x,y', '1,2,3' |
Where-Object { $_.Split(',').Count -gt 2 } |
ForEach-Object { $x = $_.Split(','); $x.Count }

The following code tries to call .Split(',') only once, but it produces no output.

'a,b,c', 'x,y', '1,2,3' |
ForEach-Object { @($_.Split(',')) } | # got a single list instead of `list of list of string`
Where-Object { $_.Count -gt 2 } |
ForEach-Object { $_.Count }

The problem is that ForEach-Object flattens the list of lists into a single list. Is there a way to prevent the flattening?

ca9163d9

1 Answer

You can take advantage of the fact that both Where-Object and ForEach-Object run the script blocks passed to them ({ ... }) in the same scope, the caller's scope:

'a,b,c', 'x,y', '1,2,3', 'a,b,c,d' |
  Where-Object { ($count = $_.Split(',').Count) -gt 2 } |
    ForEach-Object { $count }

That is, the $count variable that is assigned to in the Where-Object script block is accessible in the ForEach-Object script block as well, input object by input object.

That said, you can do all you need with ForEach-Object alone:

'a,b,c', 'x,y', '1,2,3', 'a,b,c,d' |
  ForEach-Object { $count = ($_ -split ',').Count; if ($count -gt 2) { $count } }

Note that I've switched from the .Split() method to using PowerShell's more flexible -split operator.
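
As a small illustration of that flexibility (the sample strings here are made up for this sketch, not taken from the question): -split treats its separator as a regular expression by default and accepts an optional limit on the number of substrings.

```powershell
# -split interprets the separator as a regex, so several delimiters
# (each followed by optional whitespace) can be handled in one pass.
$parts = 'a, b;c' -split '[,;]\s*'
$parts.Count      # 3

# An optional second operand caps the number of substrings returned;
# the last substring receives the unsplit remainder.
$head, $rest = 'a,b,c,d' -split ',', 2
$rest             # b,c,d
```

The .Split() method, by contrast, splits on literal characters or strings only.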


As for what you tried:

Outputting an array (enumerable) to the pipeline causes its elements to be sent one by one rather than as a whole array - see this answer for background information.

The simplest way to avoid that, i.e., to send an array as a whole, is to wrap such an array in an auxiliary single-element wrapper array, using the unary form of ,, the array-construction operator: , $_.Split(',')
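
Applied to the pipeline from the question, a minimal sketch of this wrapping technique might look as follows (the $counts variable is introduced here only so the result can be inspected):

```powershell
# The unary comma wraps each split result in a single-element array.
# The pipeline enumerates only that wrapper, so the inner string array
# reaches Where-Object as one object and $_.Count is its element count.
$counts = 'a,b,c', 'x,y', '1,2,3' |
  ForEach-Object { , $_.Split(',') } |
  Where-Object { $_.Count -gt 2 } |
  ForEach-Object { $_.Count }

$counts   # 3 and 3
```

With this change .Split(',') runs once per input string, and the two-item input 'x,y' is filtered out.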

Note that enclosing a command in @(...) does not perform the same wrapping, because @(...) doesn't construct an array. Loosely speaking, it merely ensures that the output is an array, so if the input already is an array, as in your case, @(...) is effectively a (costly) no-op; see the bottom section of this answer for details.
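
The difference can be observed by checking what the next pipeline stage receives. In this sketch the variable names $viaAt and $viaComma are illustrative only:

```powershell
# With @(...), the array is still enumerated on output, so the next
# stage sees three separate strings.
$viaAt = 'a,b,c' |
  ForEach-Object { @($_.Split(',')) } |
  ForEach-Object { $_.GetType().Name }

# With the unary comma, only the wrapper is enumerated, so the next
# stage sees one string array.
$viaComma = 'a,b,c' |
  ForEach-Object { , $_.Split(',') } |
  ForEach-Object { $_.GetType().Name }

$viaAt      # String, String, String
$viaComma   # String[]
```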

mklement0
  • Re: _"Outputting an array ... causes its elements to be sent one by one rather than as a whole array."_ It's probably useful to clarify: it's not quite outputting just an _'array'_, but outputting **arrays some of whose items are arrays**, and the _'sending of items one by one'_ applies to the items - items that are arrays get 'flattened' onto the outer array. The flattening happens to `ForEach-Object { @($_.Split(',')) }` because it is such an 'array whose items are arrays'. [An SO answer here](https://stackoverflow.com/a/61199149/4356868) may provide some more explanation of this behavior. – Slawomir Brzezinski Mar 27 '22 at 10:55
  • @SlawomirBrzezinski, there are no nested arrays in play in this case, assuming that's what you mean, and discussing the behavior in terms of _flattening of arrays_ rather than _enumeration_ is ultimately confusing. In each `ForEach-Object` iteration, `$_.Split(',')` returns a flat array (wrapping it in `@(...)` is just a costly no-op). On output to the pipeline, the array gets _enumerated_, so that each iteration outputs _individual strings_, which, when _captured in a variable_ becomes a regular (in this case flat) PowerShell array, of type `[object[]]`. – mklement0 Mar 27 '22 at 15:58
  • I see that, with my suggestion for a different description, I was not mindful of the context, which is piping (I came here from a discussion of cmdlets' results without pipes). I pointed out that the `ForEach-Object` cmdlet produces an enumerable, but everything in piping does - enumerables wrap everything (even `$scalar | ...`), so you are correct to call the items the 'output'. Sorry! Re: _"discussing the behavior in terms of ... arrays rather than enumeration is ultimately confusing"_ I used 'arrays' only in the way you used the term, as a synonym (in your _"array (enumerable)"_), else my comment would be too long ;P – Slawomir Brzezinski Mar 28 '22 at 00:07
  • As said, your description is indeed enough for pipes, because they add the outer enumerable implicitly, but just to demonstrate the interesting fact about this behavior outside of the pipes context, execute this one-liner: `function GetSingleResultAsArray { return ,"1" }; "$($(GetSingleResultAsArray).GetType())"`. It outputs 'string', so the array (forced with `,"1"`) got flattened. The [SO answer I mentioned](https://stackoverflow.com/a/61199149/4356868) may be useful if you are interested further. – Slawomir Brzezinski Mar 28 '22 at 00:29
  • @SlawomirBrzezinski, pipelines also apply without explicit use of `|`, the pipeline _operator_: any script or function outputs _to the pipeline_, so the same rules apply to your `GetSingleResultAsArray` function: The single-element array, `, "1"`, was _enumerated_ and capturing the output stream that contains that _single object_ therefore captured _that object itself_. Conceptualizing the PowerShell pipeline in terms of _arrays_ only leads to confusion. – mklement0 Mar 28 '22 at 00:41
  • This is understood (though I, for one, was once surprised). But are you still asserting that the enumerable/array distinction is the core issue here? In my example, what was surprising/confusing was that some value-mangling pipeline is even involved in the evaluation of the inner part of expression trees. You posit that what happened was normal and expected, but I doubt most programmers would say they expect this specific stripping of the array. In any case, it's the lack of awareness of what you describe as 'always outputting to the pipeline' that leads to the confusion, not some 'lack of conceptualizing with enumerables'. – Slawomir Brzezinski Mar 28 '22 at 02:03
  • @SlawomirBrzezinski, I'm not sure what you're trying to tell me, but I invite you to provide feedback on [this answer](https://stackoverflow.com/a/71641785/45375) instead, which explains the underlying concepts - which are undoubtedly surprising to those coming from other languages - in detail. – mklement0 Mar 28 '22 at 02:25