Is it possible to call a PowerShell function or script several times in parallel and wait for all of the data to return before moving on? I have done this on Linux, and my pseudocode below follows the Linux style as an example.

Say we have a function called 'Get-OSRebootDates' which returns the recent reboots as a result string. Instead of running it serially, can I run it in parallel?

Normal serial method

$Result = @()
ForEach ($pcname in $pclistarray) {
     $Result += Get-OSRebootDates -Target $pcname
     # call it one at a time and wait before moving on to the next  
} 
# then do something with data

Pseudo code of what I would like to do in parallel:

$Result = @()
$($Result += Get-OSRebootDates -Target $pcname1)  &
$($Result += Get-OSRebootDates -Target $pcname2)  &
$($Result += Get-OSRebootDates -Target $pcname3)  &
$($Result += Get-OSRebootDates -Target $pcname4)  &
Wait 
# then do something with the data

FYI example function

Function Get-OSRebootDates {
# Summary: Search the System log for recent reboots within a range of hours (e.g. $RangeNegHours -> -340) (this search is fast)
# Params: Target is the host, RangeNegHours is how far to look back (negative hours)
Param(
    [Parameter(Mandatory = $true)]$Target, 
    [Parameter(Mandatory = $false)]$RangeNegHours = -8760 # default 1 year 
)
$filters = @{}
$filters.Add("StartTime", ((Get-Date).AddHours([int]$RangeNegHours)))
$filters.Add("EndTime", (Get-Date))
$filters.Add("LogName", "System")
$filters.Add("ID", 6005) #6005 started, #6006 shutdown
$filters.Add("providername", '*EventLog*')
$filters.Add("level", 4)
$RebootDates = [string]::Empty
Try { 
    $LogRaw = Get-WinEvent -ErrorAction Stop -FilterHashtable $filters -MaxEvents 2500000  -ComputerName $Target | select-object id, machinename, timecreated, message, LevelDisplayName, ProviderName
}
Catch {
    Write-Host ("Something went wrong with [" + $MyInvocation.MyCommand + "] for " + $Target + " ..") -ForegroundColor Red
    start-sleep -seconds 4
    return $RebootDates
}

$Count = 0
ForEach ($Item in $LogRaw) {
    If ([string]$Item.TimeCreated) {
        $RebootDates = $RebootDates + [string]$Item.TimeCreated + "`n"
        $Count += 1
        If ($Count -gt 5) {
            break;
        }
    }
}
Return [string]$RebootDates

}

UPDATE: Code Solution

Below is a working demo of what I was looking for. It calls a function against a list of servers and performs the work as jobs. The function, 'SampleFunction', does little more than return a string.

$servers = @('Server1','Server2','Server3')
$maxthreads = 4 # => Limit num of concurrent jobs

# Sample Function to test
function SampleFunction ([string]$Name) {
    Start-Sleep 30
    Return "Hello $Name from function .."
}
$funcDef = "function SampleFunction {$function:SampleFunction}"

Get-Job Thread* | Remove-Job | Out-Null

$jobs = foreach ($server in $servers) {
    
    $running = Get-Job -State Running
    #write-host("Running:"+$running.Count.ToString()) ;Get-Job Thread*
    if ($running.Count -ge $maxthreads) {
        $null = $running | Wait-Job
    }
    #Start-Sleep 1
    #Write-Host "Starting job for $server"
    $ThreadName = "Thread-$server"
    
    Start-Job -Name $ThreadName {
        . ([scriptblock]::Create($using:funcdef)) # => Load the function in this scope
        return SampleFunction -Name $using:server
    } # => Better to capture the Job instances /// | Out-Null
}

$result = $jobs | Receive-Job -Wait -AutoRemoveJob
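
One detail of the throttle above: piping all running jobs to Wait-Job blocks until every one of them has finished, so the work proceeds in batches. A possible variant, sketched here with placeholder server names and a shortened sleep, uses Wait-Job -Any so that a new job starts as soon as a single slot frees up. It also counts only jobs whose name starts with the Thread- prefix, which is an assumption about how the jobs are labeled.

```powershell
$servers    = 'Server1', 'Server2', 'Server3', 'Server4', 'Server5'  # placeholder names
$maxthreads = 2                                                      # concurrent job limit

# Sample function to test with (shortened sleep for the demo)
function SampleFunction ([string]$Name) {
    Start-Sleep -Seconds 2
    "Hello $Name from function .."
}
$funcDef = "function SampleFunction {$function:SampleFunction}"

$jobs = foreach ($server in $servers) {
    # Look only at the jobs created by this script, selected by name prefix.
    $running = @(Get-Job -Name 'Thread-*' -ErrorAction SilentlyContinue |
        Where-Object State -EQ 'Running')
    if ($running.Count -ge $maxthreads) {
        # -Any returns as soon as one job finishes, freeing a single slot.
        $null = $running | Wait-Job -Any
    }
    Start-Job -Name "Thread-$server" -ScriptBlock {
        . ([scriptblock]::Create($using:funcDef)) # load the function in the job's scope
        SampleFunction -Name $using:server
    }
}

$result = $jobs | Receive-Job -Wait -AutoRemoveJob
$result
```

The results may arrive in a different order than the server list, so anything that needs to be matched back to a host should carry the host name in its output.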
Santiago Squarzon
Mike Q

1 Answer


tl;dr: regarding your Normal serial method, using += on a System.Array is highly inefficient. Arrays are fixed-size collections, so PowerShell has to recreate the array on each iteration of your foreach loop. See this Q&A for a much better explanation.

This should be your go-to Normal serial method; alternatively, you can use System.Collections.Generic.List<T> or System.Collections.ArrayList.

$Result = ForEach ($pcname in $pclistarray) {
     Get-OSRebootDates -Target $pcname
     # call it one at a time and wait before moving on to the next  
} 
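
For reference, the List<T> alternative mentioned above could look like this. It is only a sketch: the stub version of Get-OSRebootDates and the host names are placeholders so the snippet runs on its own, not the real function from the question.

```powershell
# Stand-in for the real function so the sketch runs anywhere (hypothetical stub).
function Get-OSRebootDates ([string]$Target) { "reboot dates for $Target" }

$pclistarray = 'pc1', 'pc2', 'pc3'   # hypothetical host names

# A generic list grows in place; Add() does not copy the whole collection
# the way += on a fixed-size array does.
$Result = [System.Collections.Generic.List[string]]::new()
foreach ($pcname in $pclistarray) {
    $Result.Add((Get-OSRebootDates -Target $pcname))
}

$Result.Count
```

Assigning the output of the foreach loop directly, as shown above, is usually the simpler choice; an explicit list is mostly useful when items are added from several places.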

Regarding your main question, how to run a function in parallel and wait for the output: first of all, in my personal experience working with Get-WinEvent, I have seen much better (faster) results doing this:

Invoke-Command -ComputerName server01 -ScriptBlock { Get-WinEvent .... }

Than doing this:

Get-WinEvent .... -ComputerName server01

In addition, Invoke-Command can handle the multithreading for you very well, since its -ComputerName parameter accepts a string[] (an array of hosts), while Get-WinEvent accepts only a single string:

Invoke-Command [[-ComputerName] <String[]>]
Get-WinEvent [-ComputerName <String>]
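
To illustrate the fan-out, here is a rough sketch of a single Invoke-Command call against several hosts. The wrapper name Get-RebootEventsRemote and the server names are made up for the example, and it only works where WinRM is enabled on the targets.

```powershell
# Hypothetical wrapper: one Invoke-Command call fans out to every host.
function Get-RebootEventsRemote {
    param(
        [Parameter(Mandatory)][string[]]$ComputerName,
        [int]$RangeNegHours = -8760   # look back one year by default
    )
    Invoke-Command -ComputerName $ComputerName -ArgumentList $RangeNegHours -ScriptBlock {
        param($hours)
        # Runs locally on each remote host, so Get-WinEvent needs no -ComputerName.
        Get-WinEvent -MaxEvents 5 -ErrorAction SilentlyContinue -FilterHashtable @{
            LogName   = 'System'
            Id        = 6005
            StartTime = (Get-Date).AddHours($hours)
        } | Select-Object MachineName, TimeCreated
    }
}

# Usage (requires WinRM on the targets; host names are placeholders):
# Get-RebootEventsRemote -ComputerName 'server01', 'server02' |
#     Sort-Object PSComputerName, TimeCreated
```

Invoke-Command adds a PSComputerName property to each returned object, which makes it easy to tell which host a result came from.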

Here is a basic example of how you can do it: first load the function in the Job's scope, then invoke it. Note that this runs 11 Jobs in parallel. If you're looking for more efficient multithreading, since Start-Job is arguably slower than a normal linear loop, consider using the ThreadJob module or Runspaces.

function sayhello ([string]$Name, [int]$Job) {
    "Hello $Name from Job $Job..."
    Start-Sleep 10
}

$funcDef = "function sayhello {$function:sayhello}"

# Each Job should take 10 seconds
$elapsed = [System.Diagnostics.Stopwatch]::StartNew()

0..10 | ForEach-Object {
    Start-Job -ScriptBlock {
        param($def, $name, $job)

        . ([scriptblock]::Create($def)) # => Load the function in this scope
        sayhello -Name $name -Job $job

    } -ArgumentList $funcDef, 'world', $_
} | Receive-Job -AutoRemoveJob -Wait

$elapsed.Stop()
"`nElapsed Time: {0} seconds" -f
$elapsed.Elapsed.TotalSeconds.ToString('0.##')

Result:

Hello world from Job 1...
Hello world from Job 6...
Hello world from Job 0...
Hello world from Job 4...
Hello world from Job 3...
Hello world from Job 5...
Hello world from Job 2...
Hello world from Job 10...
Hello world from Job 7...
Hello world from Job 9...
Hello world from Job 8...

Elapsed Time: 13.82 seconds
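
For comparison, the same pattern with the ThreadJob module mentioned earlier might look like the sketch below. It assumes Start-ThreadJob is available (the module is bundled with PowerShell 7; on Windows PowerShell 5.1 it has to be installed separately). The job count, the sleep duration, and the throttle value are arbitrary choices for the demo.

```powershell
# Assumes the ThreadJob module is available (bundled with PowerShell 7;
# on Windows PowerShell 5.1: Install-Module ThreadJob -Scope CurrentUser).
function sayhello {
    param([string]$Name, [int]$Job)
    Start-Sleep -Seconds 2
    "Hello $Name from Job $Job..."
}
$funcDef = "function sayhello {$function:sayhello}"

$output = 0..5 | ForEach-Object {
    # -ThrottleLimit caps how many thread jobs run at once; the rest are queued.
    Start-ThreadJob -ThrottleLimit 3 -ArgumentList $funcDef, 'world', $_ -ScriptBlock {
        param($def, $name, $job)
        . ([scriptblock]::Create($def))   # load the function in the thread's runspace
        sayhello -Name $name -Job $job
    }
} | Receive-Job -Wait -AutoRemoveJob

$output
```

Thread jobs run in the current process rather than in a separate one per job, which is where the lower startup cost comes from. The built-in throttle also removes the need for a hand-written limit on concurrent jobs.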
Santiago Squarzon
  • Thanks for the feedback. I am aware of the "+=" issue, but it's good to have it here in your comments. Here it's not the issue; it's the time it takes to pull the data. I'm actually getting something more time consuming: I'm pulling application logs from 4 servers for a specific time window. Also, sadly I can't use "Invoke-Command" because the Windows Remote Management service isn't running, so the call would just lock the script. – Mike Q Dec 15 '21 at 14:44
  • Hey, also I will look at your code now and try it. Thank you for your help. – Mike Q Dec 15 '21 at 14:44
  • @MikeQ happy to help. I feel the obligation to comment on `+=` for `system.array` whenever I see it in a question. It's good that you were already aware of it. If you have any doubts let me know, but the concept of passing the function into the Job's scope should be the same. – Santiago Squarzon Dec 15 '21 at 14:47
  • I provided sample code in my solution that seems to work. I call a bunch of jobs and clean them up based on their names: Thread-Server1, Thread-Server2, etc. – Mike Q Dec 16 '21 at 03:45
  • @MikeQ want me to update your code a bit? – Santiago Squarzon Dec 16 '21 at 03:55
  • If you have any ideas that would be great!!! @Santiago Squarzon – Mike Q Dec 16 '21 at 13:17
  • @MikeQ done. Try to avoid the global scope whenever possible, and if there is no other way, use the `$script:` scope. – Santiago Squarzon Dec 16 '21 at 13:24
  • I didn't know that. I thought I was making things "better" by using globals, because people reading the code would see that it's not a passed-in param. Thanks... I also saw the Out-Null removed, thanks. I also decided to label the jobs I started with a prefix so I can ensure they are removed. – Mike Q Dec 16 '21 at 14:50
  • @MikeQ it's not that using `global:` is the end of the world, but it can bring problems, and using globals when there is no need is not a good idea. There are some articles and questions like this one for more reference: https://stackoverflow.com/questions/39818452/global-variable-use-case. As for `| Out-Null`, that's fine, it looks cool, but if you're an efficiency maniac like me you will want to use `$null = ...` or `[void]` :) – Santiago Squarzon Dec 16 '21 at 14:54
  • Thanks. I noticed that the Get-Job Thread* | Remove-Job | Out-Null was removed from the bottom. The reason I had that was because I heard that jobs can finish and not go away if they terminate for some weird reason. So I figured I would name them and then, as an extra measure, purge any jobs that are named Thread*. – Mike Q Dec 16 '21 at 15:41
  • @MikeQ `-Wait -AutoRemoveJob` should handle this. If for some reason the Job couldn't be removed it will throw an exception, but that would be a very rare case. – Santiago Squarzon Dec 16 '21 at 15:42
  • Ah sorry, I missed that change! Thanks! – Mike Q Dec 16 '21 at 15:47
  • @MikeQ happy to help and glad you have learnt something new. If you're interested in learning the different multithreading options in PowerShell and how to apply them, check out https://github.com/santysq/Multithreading-Perfomance-on-PowerShell/blob/main/performanceTest.ps1 – Santiago Squarzon Dec 16 '21 at 15:49
  • Would you declare $result as a custom object first, or is that not necessary for $result = $jobs | Receive-Job -Wait -AutoRemoveJob? Thanks, I'll check out that link! – Mike Q Dec 16 '21 at 15:51
  • @MikeQ `Receive-Job` will serialize and deserialize the output from the scriptblock when it's received in the local session, be it an `object[]`, a `string`, or whichever type. There is no need to constrain or declare it. – Santiago Squarzon Dec 16 '21 at 15:53
  • Oh, so looking at my previous question about Remove-Job a bit more: the use case I was trying to cover was the script being stopped prematurely. It seems I need a Stop-Job and then a Remove-Job, because I don't want to overwhelm the server. My mistake. Also, there's probably no harm in including that at the end of the script as well. – Mike Q Dec 16 '21 at 16:02