
The PowerShell Strongly Encouraged Development Guidelines state that cmdlets should Implement for the Middle of a Pipeline, but I suspect that isn't doable for a parameter such as -Last of Select-Object, simply because you can't determine the last entry up front. In other words: you need to wait for the input stream to finish before you can determine the last entry.
To prove this, I wrote a little script:

$Data = 1..5 | ForEach-Object {[pscustomobject]@{Index = "$_"}}

$Data | ForEach-Object { Write-Host 'Before' $_.Index; $_ } |
Select-Object -Last 5 | ForEach-Object { Write-Host 'After' $_.Index }

and compared this to Select-Object *:

$Data | ForEach-Object { Write-Host 'Before' $_.Index; $_ } |
Select-Object * | ForEach-Object { Write-Host 'After' $_.Index }

With results (left: Select-Object -Last 5, right: Select-Object *):

-Last 5  *
-------  -
Before 1 Before 1
Before 2 After 1
Before 3 Before 2
Before 4 After 2
Before 5 Before 3
After 1  After 3
After 2  Before 4
After 3  After 4
After 4  Before 5
After 5  After 5

Although this isn't documented, I think I can conclude from this that the -Last parameter indeed blocks the pipeline until all input has arrived.
This is not a big deal, but I also tested the -First parameter and got some disturbing results. To show this more clearly, I am not selecting all the objects but just the **-First 2**:

$Data | ForEach-Object { Write-Host 'Before' $_.Index; $_ } |
Select-Object -First 2 | ForEach-Object { Write-Host 'After' $_.Index }

Before 1
After 1
Before 2
After 2

Note that with the -First 2 parameter not only the following cmdlet shows two objects but also the preceding cmdlet (ForEach-Object { Write-Host 'Before' $_.Index; $_ }) shows only 2 objects (instead of 5).

Apparently, the -First parameter reaches directly into the preceding cmdlet, which is different from using e.g. the -Last 2 parameter:

$Data | ForEach-Object { Write-Host 'Before' $_.Index; $_ } |
Select-Object -Last 2 | ForEach-Object { Write-Host 'After' $_.Index }

Before 1
Before 2
Before 3
Before 4
Before 5
After 4
After 5

The same thing happens with -First when using Out-Host instead of Write-Host, or when collecting the results in a variable, like:

$Before = ""; $After = ""
$Data | ForEach-Object { $Before += $_.Index; $_ } | Select-Object -First 2 | ForEach-Object { $After += $_.Index }
$Before
$After

This shows on both Windows PowerShell (5.1.18362.628) and PowerShell Core (7.0.0).
Is this a bug?

iRon
  • `Sort-Object`, `Group-Object`, and `Select-Object` [with one of the "need them all" parameters] require that the previous pipeline stage send "there are no more items" before they will do the required thing. it's logical ... and i thot it was documented, but i cannot find such at this time. – Lee_Dailey Mar 21 '20 at 17:05
  • "Is this a bug?" - is _what_ a bug? You've described some observed behavior, and you seem perfectly capable of reasoning about _why_, but I don't see anything... broken? – Mathias R. Jessen Mar 21 '20 at 17:28
  • @Mathias, sorry that wasn't very clear, but the issue with `-First 2` is that **the preceding cmdlet also shows 2 objects** and not the initial 5 (I have added this to the question). – iRon Mar 21 '20 at 17:49
  • @iRon Ahh, gotcha! There's a good explanation for that! – Mathias R. Jessen Mar 21 '20 at 17:49
  • Select-object doesn't block, but format-table in the background blocks. I like the feature that select-object -first kills the pipeline. It makes sense that select-object -last would block. – js2010 Mar 21 '20 at 17:50

1 Answer


Select-Object affects the upstream commands by cheating

That might sound like a joke, but it's not.

To optimize pipeline streaming performance, Select-Object uses a trick not available to a regular user developing a cmdlet: it throws a StopUpstreamCommandsException, an exception type that is internal to PowerShell.

Once caught, the runtime (indirectly) calls StopProcessing() on all the preceding commands, but does not treat it as a terminating error event, allowing the downstream cmdlets to continue executing.
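
A rough way to observe this, as a sketch rather than anything authoritative: `Test-Upstream` below is a hypothetical function (not part of PowerShell) that records every call in `$log`. Because the upstream command is stopped rather than completed, its `end` block should never be recorded, while `Select-Object` still hands its two objects downstream.

```powershell
# Log shared with the function below through PowerShell's dynamic scoping
$log = [System.Collections.Generic.List[string]]::new()

# Hypothetical upstream stage, for illustration only
function Test-Upstream {
    param([Parameter(ValueFromPipeline)] $InputObject)
    process {
        $log.Add("process $InputObject")  # recorded before the object is passed on
        $InputObject
    }
    end {
        $log.Add('end')                   # only reached if the stage completes normally
    }
}

$out = 1..5 | Test-Upstream | Select-Object -First 2

$log   # calls the upstream stage actually received
$out   # objects that made it through Select-Object
```

The practical consequence is that anything an upstream function defers to its `end` block (flushing, summaries) is skipped once a later `Select-Object -First` has what it needs.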

This is extremely useful when you have a slow or computationally heavy command early in a pipeline:

# With the StopUpstreamCommands behavior this takes only ~3 seconds to return;
# without it, another ~2 seconds would be spent "waiting to discard" the rest
Measure-Command {
  1..5 | ForEach-Object { Start-Sleep -Seconds 1; $_ } | Select-Object -First 3
}
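
For contrast, `-Last` has no such shortcut: it has to hold on to candidates until the input ends. A minimal sketch of that buffering, assuming a hypothetical function `Select-LastN` (this is not how `Select-Object` is actually implemented, just the general idea):

```powershell
# Hypothetical stand-in for "Select-Object -Last", for illustration only
function Select-LastN {
    param(
        [Parameter(ValueFromPipeline)] $InputObject,
        [int] $Count = 1
    )
    begin {
        # Sliding window holding at most $Count of the most recent objects
        $buffer = [System.Collections.Generic.Queue[object]]::new()
    }
    process {
        $buffer.Enqueue($InputObject)
        if ($buffer.Count -gt $Count) { [void]$buffer.Dequeue() }
    }
    end {
        # Nothing can be emitted earlier: only now is it known which objects were last
        $buffer.ToArray()
    }
}

1..5 | Select-LastN -Count 2
```

All output happens in the `end` block, which is why every 'Before' line in the question appears ahead of the first 'After' line.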
Mathias R. Jessen
  • And if for some reason you would like it to still process all of the pipeline objects even though it won't output some of them, you can add the `-Wait` parameter to `Select-Object`. This also obviously prevents the `StopUpstreamCommandsException` from being thrown. – Shenk May 06 '21 at 23:15
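
A small sketch of the `-Wait` behavior described in that comment, reusing the counting idea from the question (the variable names `$seen` and `$picked` are just for illustration):

```powershell
# Track what the upstream ForEach-Object actually gets to process
$seen = [System.Collections.Generic.List[int]]::new()

# -Wait turns off the early-stop optimization, so upstream runs to completion
$picked = 1..5 |
    ForEach-Object { $seen.Add($_); $_ } |
    Select-Object -First 2 -Wait

"Upstream processed: $($seen.Count)"
"Selected: $($picked -join ', ')"
```

Dropping `-Wait` from the same pipeline should bring the upstream count back down to 2, matching the output shown in the question.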