
Edit: To clarify, my goal is only to show a progress bar with a known end (e.g. the end of the pipeline) so that the function can report a percentage toward completion (usually for large sets in the hundreds or thousands). I occasionally write functions that accept objects either from the pipeline or via a parameter so that the function stays flexible.

A progress bar for an array of objects coming in from a parameter is simple enough, and is equivalent to pulling in the full pipeline set first and then processing the objects again. I have been avoiding the latter, and I simply forego Write-Progress in the pipeline scenario as it's not worth the impact.

I don't recall where I saw it, but someone mentioned $PSCmdlet.MyInvocation might provide the count, but perhaps I interpreted that incorrectly.


I've written some functions that take pipeline input and often would like to write a percentage progress bar for all the objects coming into the function via the pipeline.

Is there a way to get the total count at the beginning of the function?

I'm aware of how to increase a counter as the function loops through the pipeline objects, but this only gets me the number of objects processed so far. I'd like to turn this into a percentage by calculating it against the full pipeline count.

I've looked at the $MyInvocation and $PSCmdlet.MyInvocation properties, but the PipelineLength and PipelinePosition values are always '2' no matter how big the pipeline set is.

Note: this isn't what I'm focusing on as a solution, it's just one of the things I found that looked promising

Here's a test:

Function Test-Pipeline {
[CmdletBinding()]

PARAM(
    [Parameter(ValueFromPipeLine=$true,ValueFromPipelineByPropertyName=$true)]
    [Alias("FullName")]
    [psobject[]]$Path
)

BEGIN{
    $PSCmdlet.MyInvocation | Select *
}
PROCESS{
    ForEach ($xpath in $Path) {
        $filepath = Resolve-Path $xpath | Select -ExpandProperty Path
    }
}
END{}
}

When I pipe in the contents of dir (this folder contains 20 items), I get:

PS C:\Temp> Dir | Test-Pipeline

MyCommand             : Test-Pipeline
BoundParameters       : {}
UnboundArguments      : {}
ScriptLineNumber      : 1
OffsetInLine          : 7
HistoryId             : 213
ScriptName            : 
Line                  : Dir | Test-Pipeline
PositionMessage       : At line:1 char:7
                        + Dir | Test-Pipeline
                        +       ~~~~~~~~~~~~~
PSScriptRoot          : 
PSCommandPath         : 
InvocationName        : Test-Pipeline
PipelineLength        : 2
PipelinePosition      : 2
ExpectingInput        : True
CommandOrigin         : Runspace
DisplayScriptPosition : 
BHall
  • For a properly written pipeline, the answer is **no**, simply because the objects you are going to receive do not yet exist. That said, you might think of assigning all the objects to a variable (and using the `count` property), but that will actually choke your pipeline and load everything into memory, which throws away the advantages of the pipeline. – iRon May 14 '21 at 05:42

4 Answers


As already commented, if you want to know the number of objects your cmdlet will receive in order to predefine the units of a progress bar, you basically can't (without breaking the streaming process).
Any "solution" that appears to do what you want actually chokes the whole pipeline by collecting all the objects (in memory), counting them, and finally releasing them all at once. This violates the Strongly Encouraged Development Guidelines to Write Single Records to the Pipeline and results in high memory usage.

Explanation

Visualize an assembly line where you are responsible for coloring all the objects that pass your station. Now you want to know how many objects you need to do today. Unless somebody outside your station (cmdlet) tells you how many will follow, you can only determine that by receiving all the objects, counting them (while still coloring them), and only then passing them on. As all these actions take time (and storage room for the objects), the person at the next station won't be happy, as they will receive all the objects at once and a lot later than expected...
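The "somebody outside your station tells you" case can be sketched as follows. This is only an illustration under assumptions: the function name `Invoke-WithProgress` and the `-ExpectedCount` parameter are hypothetical and not part of the question. If the caller supplies a total, a percentage is shown; otherwise only the running count is reported, and objects are passed on immediately either way.

```powershell
function Invoke-WithProgress {            # hypothetical name, for illustration only
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true)]
        $InputObject,

        # Hypothetical parameter: the caller states how many objects will follow.
        [int]$ExpectedCount = 0
    )
    begin {
        $done = 0
    }
    process {
        $done++
        $progress = @{ Activity = 'Processing'; Status = "$done processed" }
        if ($ExpectedCount -gt 0) {
            # A percentage is only possible because the caller supplied the total.
            $progress.PercentComplete = [Math]::Min(100, [int]($done * 100 / $ExpectedCount))
        }
        Write-Progress @progress
        $InputObject                      # pass each object on right away (streaming)
    }
    end {
        Write-Progress -Activity 'Processing' -Completed
    }
}

# The caller knows the total and passes it along; without -ExpectedCount only the count is shown.
1..50 | Invoke-WithProgress -ExpectedCount 50 | Out-Null
```

The design choice here is that the function never buffers its input; any knowledge of the total has to come from outside.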

Technically

Let's build a cmdlet which generates colorless objects from a ProcessList:

function Create-Object {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeLine=$true)]$ProcessList,
        [Switch]$Show
    )
    begin {
        $Index = 0
    }
    process {
        $Object = [PSCustomObject]@{
            Index = $Index++
            Item  = $ProcessList
            Color = 'Colorless'
        }
        if ($Show) { Write-Host 'Created:' $Object.PSObject.Properties.Value }
        $Object
    }
}

And this is you, coloring the objects:

function Color-Object {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeLine=$true)]$InputObject,
        [Switch]$Show
    )
    process {
        $InputObject.Color = [ConsoleColor](Get-Random 16)
        if ($Show) { Write-Host 'Colored:' $InputObject.PSObject.Properties.Value }
        $InputObject
    }
}

This will be the result:

'a'..'e' |Create-Object |Color-Object

Index Item       Color
----- ----       -----
    0    a DarkMagenta
    1    b      Yellow
    2    c        Blue
    3    d        Gray
    4    e       Green

Now let's see, how things are actually processed:

'a'..'e' |Create-Object -Show |Color-Object -Show

Created: 0 a Colorless
Colored: 0 a DarkGreen

Created: 1 b Colorless
Colored: 1 b DarkRed
Created: 2 c Colorless
Colored: 2 c Gray
Created: 3 d Colorless
Colored: 3 d DarkGreen
Created: 4 e Colorless
Colored: 4 e DarkGray
Index Item     Color
----- ----     -----
    0    a DarkGreen
    1    b   DarkRed
    2    c      Gray
    3    d DarkGreen
    4    e  DarkGray

As you can see, the first item "a" (index 0) is colored before the second item "b" (index 1) is created!
In other words, Create-Object hasn't created all the objects yet, and there is no way to know how many will follow, other than by waiting for all of them. As explained before, you do not want to do that, and you certainly want to avoid it if there are a lot of objects, if the objects are fat (and a PowerShell object is usually fat), or if the input is slow (see also: Advocating native PowerShell). That said, you might want to make an exception if enumerating (counting) the objects is trivial compared to the rest of the process: e.g. collecting file info (Get-ChildItem) to later invoke a heavy process (such as Get-FileHash).
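The exception mentioned above might look like the sketch below: the cheap enumeration is collected first so that the total is known, and only then is the expensive step run with a percentage. The folder (`$PSHOME`) is just a placeholder assumption; point `$folder` at any real directory.

```powershell
# Assumption: $folder is a placeholder; replace it with the directory you want to hash.
$folder = $PSHOME

# Cheap step: enumerate the files up front so the total is known.
$files = @(Get-ChildItem -LiteralPath $folder -File)

$i = 0
$hashes = $files | ForEach-Object {
    $i++
    Write-Progress -Activity 'Hashing files' -Status $_.Name `
                   -PercentComplete ($i * 100 / $files.Count)
    # Expensive step: hashing dominates the run time, so the up-front count is negligible.
    Get-FileHash -LiteralPath $_.FullName
}
Write-Progress -Activity 'Hashing files' -Completed
```

This deliberately gives up streaming for the file list, which is acceptable only because the file info objects are cheap compared to the hashing that follows.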

iRon
  • Thanks for the great explanation; it confirmed the gut feeling I had. I always forego the progress bar when writing functions that take pipeline input with more than 1 object, as it is just a nice-to-have and not worth the expense of grabbing the entire pipeline set to count it. I guess I'll continue to use an increasing counter instead (which doesn't explicitly indicate how much is left in the pipeline). – BHall May 15 '21 at 00:24

If you write the function without Begin, Process and End blocks, you can use the $input automatic variable to figure out how many items are sent through the pipeline.

$input contains an enumerator that enumerates all input that is passed to a function. The $input variable is available only to functions and script blocks (which are unnamed functions).

function Test-Pipeline {
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeLine=$true,ValueFromPipelineByPropertyName=$true)]
        [Alias("FullName")]
        [psobject[]]$Path
    )
    # $input is an enumerator, so collect it once before counting
    $items = @($input)
    $count = $items.Count
    Write-Host "$count items are sent through the pipeline"

    # without a process block, $Path holds only the last pipeline item, so loop over $items
    # changed variable '$xpath' into '$item' to avoid confusion with XML navigation using XPath
    foreach ($item in $items) {
        # process each item
        # $filepath = Resolve-Path $item | Select -ExpandProperty Path
    }
}

Get-ChildItem -Path 'D:\Downloads' | Test-Pipeline

This outputs something like:

158 items are sent through the pipeline
Theo

You can't do that inside the BEGIN block. The only way would be to add the elements from the pipeline to a list and THEN run your code at the end.

Here's a way to do just that:

Function Test-Pipeline {
    [CmdletBinding()]
    
    PARAM(
        [Parameter(ValueFromPipeLine = $true, ValueFromPipelineByPropertyName = $true)]
        [Alias("FullName")]
        [psobject[]]$Path
    )
    
    BEGIN {
        $i = 0
        $list = @()
    }
    PROCESS {
        ForEach ($xpath in $Path) {
            $list += $xpath
            $i++
        }
    }
    END {
        $i
        if ($i -gt 0) {
            $j = 1
            ForEach ($item in $list) {
                Write-Output $item
                Write-Progress -Activity 'activity' -Status $item.ToString() -PercentComplete (($j++)*100/$i)
                Start-Sleep -Seconds 2
            }
        }
    }
}
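A variant of the same collect-then-process idea is sketched below, gathering the items in a generic list instead of growing an array with `+=`. The function name `Test-PipelineBuffered` is hypothetical, and the sketch still buffers the whole input before emitting anything, so it does not stream.

```powershell
function Test-PipelineBuffered {          # hypothetical name, for illustration only
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true, ValueFromPipelineByPropertyName = $true)]
        [Alias('FullName')]
        [psobject[]]$Path
    )
    begin {
        # A generic list appends in place instead of copying the array on every addition.
        $list = [System.Collections.Generic.List[object]]::new()
    }
    process {
        foreach ($p in $Path) { $list.Add($p) }
    }
    end {
        $j = 0
        foreach ($item in $list) {
            $j++
            Write-Progress -Activity 'activity' -Status "$item" `
                           -PercentComplete ($j * 100 / $list.Count)
            $item                         # emitted only after all input has been collected
        }
        Write-Progress -Activity 'activity' -Completed
    }
}

1..5 | Test-PipelineBuffered | Out-Null
```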
PollusB
  • Apart from the issue that this will choke (and restart) the pipeline, you [should generally avoid using the increase assignment operator (`+=`) to create a collection](https://stackoverflow.com/a/60708579/1701026) as it is very expensive. – iRon May 14 '21 at 12:35
  • Do you have an example of "choke and restart"? And I don't see where it's requested not to be expensive. If it's for 10 files, it's fine. For 100000000, I would advise differently. – PollusB May 14 '21 at 15:41
  • see my [answer](https://stackoverflow.com/a/67534378/1701026), in other words, check the order of the input and output objects (with e.g. `write-host`), you might also check the memory usage of your function to confirm that you hoard the objects. – iRon May 14 '21 at 15:48
  • I appreciate the suggestion, PollusB. I generally avoid pulling in the full pipeline and then processing it again since I can sometimes just include logic to process an array of objects fed in through a parameter – BHall May 15 '21 at 00:51

If you need the count, you can use the Tee-Object cmdlet to store the pipeline objects in a new variable.

dir | tee-object -Variable toto | Test-Pipeline

Then

Function Test-Pipeline {
[CmdletBinding()]

PARAM(
    [Parameter(ValueFromPipeLine=$true,ValueFromPipelineByPropertyName=$true)]
    [Alias("FullName")]
    [psobject[]]$Path
)

BEGIN{
    $PSCmdlet.MyInvocation | Select *
    $toto.count

}
PROCESS{
    ForEach ($xpath in $Path) {
        $filepath = Resolve-Path $xpath | Select -ExpandProperty Path
    }
}
END{}
}

This gives the following for me:

MyCommand             : Test-Pipeline
BoundParameters       : {}
UnboundArguments      : {}
ScriptLineNumber      : 1
OffsetInLine          : 35
HistoryId             : 11
ScriptName            : 
Line                  : dir | tee-object -Variable toto | Test-Pipeline
PositionMessage       : At line:1 char:35
                        + dir | tee-object -Variable toto | Test-Pipeline
                        +                                   ~~~~~~~~~~~~~
PSScriptRoot          : 
PSCommandPath         : 
InvocationName        : Test-Pipeline
PipelineLength        : 3
PipelinePosition      : 3
ExpectingInput        : True
CommandOrigin         : Runspace
DisplayScriptPosition : 

73

And

dir | Measure-Object


Count    : 73
Average  : 
Sum      : 
Maximum  : 
Minimum  : 
Property : 
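One caveat is worth checking before relying on this. As far as I know, Tee-Object -Variable assigns the variable only once its input has ended, so a value read in a downstream BEGIN block may come from an earlier run rather than the current one. The sketch below makes the timing visible; the function name `Test-TeeTiming` is hypothetical, and the variable is removed first so that a stale value cannot hide the effect.

```powershell
# Start clean so a value left over from a previous run cannot be mistaken for the current one.
Remove-Variable -Name toto -ErrorAction Ignore

function Test-TeeTiming {                 # hypothetical name, for illustration only
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true)]
        $InputObject
    )
    begin {
        # Report whether the Tee-Object variable is already visible here.
        [pscustomobject]@{ Block = 'begin'; TotoIsSet = ($null -ne $toto) }
    }
    process { }
    end {
        [pscustomobject]@{ Block = 'end'; TotoIsSet = ($null -ne $toto) }
    }
}

$result = 1..5 | Tee-Object -Variable toto | Test-TeeTiming
$result | Format-Table
```

If the begin row reports that the variable is not set, the count shown by the function above was most likely left over from a previous invocation in the same session.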
JPBlanc
  • Oh, this is kind of interesting! Making the count available through a global variable. I hadn't really thought of that for this scenario. – BHall May 15 '21 at 01:09