I'm working on a PowerShell module that needs to call a .NET method that accepts a string list parameter. The method constructs a REST URL based on the string collection, retrieving the results for all elements in the list in a single API call:

var MyAPIConnector = new APIConnector();
var elements = File.ReadAllLines(@"C:\temp\myfile.txt").ToList();
List<ApiResult> Result = MyAPIConnector.GetResult(elements);

This code executes within a couple of seconds.

In my PowerShell module, the same operation takes much longer. Here's how the module is coded:

[Cmdlet(VerbsCommon.Get, "ApiResult")]
[OutputType(typeof(ApiResult))]
public class GetApiResults : Cmdlet
{
    [Parameter(Mandatory = true, ValueFromPipeline = true)]
    public string[] Identity {get; set;}

    private List<string> Input;
    private List<ApiResult> Result;

    protected override void BeginProcessing()
    {
        base.BeginProcessing();
    }

    protected override void ProcessRecord()
    {
        base.ProcessRecord();
        Result = MyAPIConnector.GetResult(Identity.ToList());
        WriteObject(Result);
    }
}

If I use my Cmdlet in the same manner as my sample program:

$Result = Get-Content -Path 'C:\temp\myfile.txt' | Get-ApiResult

The same result is returned, but it is much, much slower. Through debugging, I determined that although my Cmdlet source code uses LINQ to convert the string array Identity to a List, MyAPIConnector.GetResult() is being called separately for each element of the array! Can anyone explain why that is?

Mike Bruno
  • This is by design: ProcessRecord() is called once per input object. You should use ProcessRecord() to accumulate the input, then call your method in EndProcessing(), passing the input you've accumulated – Mathias R. Jessen Feb 09 '21 at 02:09
  • PowerShell is optimized for streaming, which (if implemented correctly) results in low memory usage and can even improve performance, since each cmdlet in the pipeline processes information simultaneously (e.g. processing one REST call while retrieving the next). See also [Advocating native PowerShell](https://stackoverflow.com/a/58357033/1701026) – iRon Feb 09 '21 at 10:30

2 Answers

UPDATE: Correction - I mistakenly wrote this up suggesting that objects are buffered coming in when -OutBuffer is used, which makes no sense; only one inbound object can be processed at a time coming from the pipeline. You can, however, use the -OutBuffer parameter to control how many objects are sent down the pipeline from a cmdlet, as described below. -OutBuffer does not help you in any way with your problem, though. :) As suggested in the final paragraph below, you may call your cmdlet and provide your string array directly as an argument to -Identity, which will allow you to utilize LINQ in the way that you want.

Now a little about -OutBuffer. :) -OutBuffer is a common parameter that takes an integer defining the number of objects to hold on to before outputting them. Each object is still processed individually when received; the output is simply buffered until OutBuffer + 1 objects are queued or there are no more objects.

Consider the following examples. I will use the following test function to demonstrate:

Function Test-PipelineInput
{
    [CmdletBinding()]
    param(
        [Parameter(ValueFromPipeline = $true)]
        [string[]]$Identity
    )

    Process {
        if ($Identity){
            foreach ($item in $Identity) {
                $item
            }
            Write-Host "---End process block"
        }
    }
}

Without -OutBuffer

$Gettysburg | Select -First 9 | Test-PipelineInput 

Four score and seven years ago our 
---End process block
fathers brought forth on this continent,
---End process block
a new nation, conceived in Liberty, and
---End process block
dedicated to the proposition that all 
---End process block
men are created equal.
---End process block
Now we are engaged in a great 
---End process block
civil war, testing whether that
---End process block
nation, or any nation so conceived
---End process block
and so dedicated, can long endure. 
---End process block

With -OutBuffer 2 (output is buffered and flushed in batches of OutBuffer + 1 = 3)

$Gettysburg | Select -First 9 | Test-PipelineInput -OutBuffer 2

---End process block
---End process block
Four score and seven years ago our
fathers brought forth on this continent, 
a new nation, conceived in Liberty, and
---End process block
---End process block
---End process block
dedicated to the proposition that all 
men are created equal.
Now we are engaged in a great
---End process block
---End process block
---End process block
civil war, testing whether that 
nation, or any nation so conceived
and so dedicated, can long endure.
---End process block

So you can see here how the process block appears not to emit anything for the first 2 objects received and then produces output on the 3rd, and so on. UPDATE: Correction - the Process block actually runs on each individual object as it is received, and holds the output objects until OutBuffer + 1 objects are queued. This still will not give you what you want, though. The option below, not using the pipeline, is the way to go if you want the whole array sent into the function at once to be processed using LINQ.

Additionally, instead of sending your objects through the pipeline, you may also just hand them over as an argument to your -Identity parameter, in which case your function receives the full string array and processes it in a single pass:

Test-PipelineInput -Identity ( $Gettysburg | Select -First 15 )

Four score and seven years ago our 
fathers brought forth on this continent, 
a new nation, conceived in Liberty, and 
dedicated to the proposition that all 
men are created equal.
Now we are engaged in a great
civil war, testing whether that 
nation, or any nation so conceived
and so dedicated, can long endure.
We are met on a great battle-field
of that war. We have come to dedicate
a portion of that field, as a final 
resting place for those who here gave
their lives that that nation might
live. It is altogether fitting and 
---End process block

ONE LAST UPDATE: Prepending the unary array operator (a comma in front of the array) allows the array to be piped to the next command as a single object. I originally thought this wasn't working, but it was something else.

,($Gettysburg | Select-Object -First 15) | Test-PipelineInput

Four score and seven years ago our
fathers brought forth on this continent,
a new nation, conceived in Liberty, and
dedicated to the proposition that all
men are created equal.
Now we are engaged in a great
civil war, testing whether that
nation, or any nation so conceived
and so dedicated, can long endure.
We are met on a great battle-field
of that war. We have come to dedicate
a portion of that field, as a final
resting place for those who here gave
their lives that that nation might
live. It is altogether fitting and
---End process block

See this post for more on that: *Pipe complete array-objects instead of array items one at a time?*

Daniel

Making the RESTful API call in ProcessRecord() resulted in the API call being made individually for each element of string[] Identity.

The solution, for my specific use case, was to make the RESTful API call in EndProcessing() rather than ProcessRecord(). In ProcessRecord(), I needed to iterate through the elements of string[] Identity and load them into a string list.

using MoreLinq;

[Cmdlet(VerbsCommon.Get, "ApiResult")]
[OutputType(typeof(ApiResult))]
public class GetApiResults : Cmdlet
{
    [Parameter(Mandatory = true, ValueFromPipeline = true)]
    public string[] Identity {get; set;}

    // Initialized here so ProcessRecord() can add to it without a null check
    private readonly List<string> Input = new List<string>();
    private List<ApiResult> Result;

    //Don't need to override BeginProcessing 

    protected override void ProcessRecord()
    {
        base.ProcessRecord();
        
        //Iterate through the entries in Identity and load them into the Input list
        Identity.ForEach(p => Input.Add(p));
    }
    
    protected override void EndProcessing()
    {
        //Once we're in the EndProcessing context, List<string> Input has all elements from string[] Identity
        Result = MyAPIConnector.GetResult(Input);
        WriteObject(Result);
        base.EndProcessing();
    }
}

Essentially, I am circumventing a feature of native PowerShell that saves memory by "streaming" the elements of Identity one at a time. This makes sense for my particular use case, since the RESTful API I'm calling performs much better when it is handed multiple input items in a single call.
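For anyone writing a script module rather than a binary one, the same accumulate-then-batch pattern can be sketched as a PowerShell advanced function. This is a minimal sketch: the end block is where the single batched API call would go (Get-ApiResultBatched is a hypothetical name, and since the real connector isn't available here, the end block simply emits the accumulated list):

```powershell
function Get-ApiResultBatched {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string[]]$Identity
    )

    begin {
        # Runs once, before any pipeline input arrives
        $buffer = [System.Collections.Generic.List[string]]::new()
    }

    process {
        # Runs once per pipeline object; only accumulate here
        foreach ($item in $Identity) { $buffer.Add($item) }
    }

    end {
        # Runs once, after all input has been received.
        # This is where the single batched call would go, e.g.:
        #   $MyAPIConnector.GetResult($buffer)
        # Here we just emit the accumulated input to show it arrived as one batch.
        $buffer
    }
}
```

Piping input such as `'a','b','c' | Get-ApiResultBatched` collects all three strings before the end block runs, mirroring the ProcessRecord()/EndProcessing() split in the C# cmdlet above.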

Credit goes to Mathias R. Jessen for the clue. Thanks!

Mike Bruno