150

I have a PowerShell script that does batch processing on a bunch of images, and I'd like to run it in parallel. PowerShell has some background processing options such as Start-Job, Wait-Job, etc., but the only good resource I found for doing parallel work involved writing the text of a script out to files and running those (PowerShell Multithreading).

Ideally, I'd like something akin to the parallel foreach in .NET 4.

Something pretty seamless like:

foreach-parallel -threads 4 ($file in (Get-ChildItem $dir))
{
   .. Do Work
}

Maybe I'd be better off just dropping down to C#...

Alan Jackson
  • **tl;dr:** `receive-job (wait-job ($a = start-job { "heyo!" })); remove-job $a` or `$a = start-job { "heyo!" }; wait-job $a; receive-job $a; remove-job $a` Note also that if you call `receive-job` before the job is finished, you might get nothing at all. – Andrew Nov 22 '17 at 18:47
  • Also `(get-job $a).jobstateinfo.state;` – Andrew Nov 22 '17 at 19:02

10 Answers

112

You can execute parallel jobs in PowerShell 2 using background jobs. Check out Start-Job and the other job cmdlets.

# Loop through the server list
Get-Content "ServerList.txt" | %{

  # Define what each job does
  $ScriptBlock = {
    param($pipelinePassIn) 
    Test-Path "\\$pipelinePassIn\c`$\Something"
    Start-Sleep 60
  }

  # Execute the jobs in parallel
  Start-Job $ScriptBlock -ArgumentList $_
}

Get-Job

# Wait for it all to complete
While (Get-Job -State "Running")
{
  Start-Sleep 10
}

# Getting the information back from the jobs
Get-Job | Receive-Job
Nino Filiu
Steve Townsend
  • So I tried this suggestion several times, but it seems that my variables aren't getting expanded correctly. To use the same example, when this line executes: `Test-Path "\\$_\c$\Something"` I would expect it to expand `$_` into the current item. However, it doesn't. Instead it returns an empty value. This only seems to happen from within script blocks. If I write that value out immediately after the first comment, it seems to work correctly. – rjg Jul 20 '11 at 18:13
  • @likwid - sounds like a separate question for the site – Steve Townsend Jul 22 '11 at 00:59
  • How can I view the output of the job which is running in background ? – SimpleGuy Jan 09 '17 at 10:53
  • @SimpleGuy - see here for info on output capture - http://stackoverflow.com/questions/15605095/powershell-where-does-start-job-output-go - does not seem like you can view this reliably until the background job completes. – Steve Townsend Jan 10 '17 at 15:25
  • @SteveTownsend Thanks ! Actually viewing output is a not so good on screen. Comes with delay, so not useful for me. Instead I started a process on new terminal (shell), so now each process is running on different terminal which gives the view of progress much better and much cleaner. – SimpleGuy Jan 11 '17 at 07:28
  • In a scriptblock, if there are Variables in it like so: $ScriptBlock = { param($pipelinePassIn) Test-Path "\\$pipelinePassIn\c`$\Something" Start-Sleep 60 } You can only pass it as arguments, means we have to pass 2 params, in the same order in which they are in the scriptblock. – Ilya Gurenko Jul 29 '20 at 10:42
105

The answer from Steve Townsend is correct in theory but not in practice as @likwid pointed out. My revised code takes into account the job-context barrier--nothing crosses that barrier by default! The automatic $_ variable can thus be used in the loop but cannot be used directly within the script block because it is inside a separate context created by the job.

To pass variables from the parent context to the child context, use the -ArgumentList parameter on Start-Job to send it and use param inside the script block to receive it.

cls
# Send in two root directory names, one that exists and one that does not.
# Should then get a "True" and a "False" result out the end.
"temp", "foo" | %{

  $ScriptBlock = {
    # accept the loop variable across the job-context barrier
    param($name) 
    # Show the loop variable has made it through!
    Write-Host "[processing '$name' inside the job]"
    # Execute a command
    Test-Path "\$name"
    # Just wait for a bit...
    Start-Sleep 5
  }

  # Show the loop variable here is correct
  Write-Host "processing $_..."

  # pass the loop variable across the job-context barrier
  Start-Job $ScriptBlock -ArgumentList $_
}

# Wait for all to complete
While (Get-Job -State "Running") { Start-Sleep 2 }

# Display output from all jobs
Get-Job | Receive-Job

# Cleanup
Remove-Job *

(I generally like to provide a reference to the PowerShell documentation as supporting evidence but, alas, my search has been fruitless. If you happen to know where context separation is documented, post a comment here to let me know!)
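
For what it's worth, on PowerShell 3.0 and later the `$using:` scope modifier is another way across the same barrier, without a `param` block. A minimal sketch, assuming a local variable named `$root` (a hypothetical name, pointed at the temp directory here only so that the check has something that exists):

```powershell
# Hypothetical local value that the job needs; the temp directory always exists.
$root = [System.IO.Path]::GetTempPath()

# $using:root copies the caller's $root into the job (PowerShell 3.0+).
$job = Start-Job -ScriptBlock { Test-Path -Path $using:root }

# Wait for the job, collect its output, then clean up.
$result = $job | Wait-Job | Receive-Job
$job | Remove-Job

$result
```

The value is copied into the job when it starts, so changes made inside the job do not flow back to the caller. On PowerShell 2 the -ArgumentList/param approach above is still the way to go.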

Michael Sorens
  • Thanks for this answer. I tried using your solution, but I was unable to get it fully working. Can you take a look at my question here: http://stackoverflow.com/questions/28509659/unzipping-works-on-singlethread-but-not-multithread – David says Reinstate Monica Feb 13 '15 at 22:23
  • Alternatively, it's pretty easy to invoke a separate script file. Just use `Start-Job -FilePath script.ps1 -ArgumentList $_` – Chad Zawistowski Jun 16 '16 at 22:54
  • An alternative approach is to do a preliminary pass of script generation, where nothing is being done but variable expansion, and then invoke the generated scripts in parallel. I have a little tool that might be adapted to script generation, although it was never meant to support script generation. You can see it [here](https://stackoverflow.com/questions/42230306/how-to-combine-a-template-with-a-csv-file-in-powershell). – Walter Mitty Jul 27 '19 at 11:43
  • This works. But I can't get live feed output stream from ScriptBlock. The output only gets printed when ScriptBlock returns. – vothaison Apr 14 '20 at 15:46
29

There are so many answers to this these days:

  1. jobs (or threadjobs in PS 6/7 or the module for PS 5)
  2. start-process
  3. workflows (PS 5 only)
  4. powershell api with another runspace
  5. invoke-command with multiple computers, which can all be localhost (have to be admin)
  6. multiple session (runspace) tabs in the ISE, or remote powershell ISE tabs
  7. Powershell 7 has a foreach-object -parallel as an alternative for #4

Using start-threadjob in powershell 5.1. I wish this worked like I expect, but it doesn't:

# test-netconnection has a miserably long timeout
echo yahoo.com facebook.com | 
  start-threadjob { test-netconnection $input } | receive-job -wait -auto

WARNING: Name resolution of yahoo.com microsoft.com facebook.com failed

It works this way. Not quite as nice as foreach-object -parallel in powershell 7, but it'll do.

echo yahoo.com facebook.com | 
  % { $_ | start-threadjob { test-netconnection $input } } | 
  receive-job -wait -auto | ft -a

ComputerName RemotePort RemoteAddress PingSucceeded PingReplyDetails (RTT) TcpTestSucceeded
------------ ---------- ------------- ------------- ---------------------- ----------------
facebook.com 0          31.13.71.36   True          17 ms                  False
yahoo.com    0          98.137.11.163 True          97 ms                  False

Here's workflows with literally a foreach -parallel:

workflow work {
  foreach -parallel ($i in 1..3) { 
    sleep 5 
    "$i done" 
  }
}

work

3 done
1 done
2 done

Or a workflow with a parallel block:

function sleepfor($time) { sleep $time; "sleepfor $time done"}

workflow work {
  parallel {
    sleepfor 3
    sleepfor 2
    sleepfor 1
  }
  'hi'
}
    
work 

sleepfor 1 done
sleepfor 2 done
sleepfor 3 done
hi

Here's an api with runspaces example:

$a =  [PowerShell]::Create().AddScript{sleep 5;'a done'}
$b =  [PowerShell]::Create().AddScript{sleep 5;'b done'}
$c =  [PowerShell]::Create().AddScript{sleep 5;'c done'}
$r1,$r2,$r3 = ($a,$b,$c).begininvoke() # run in background
$a.EndInvoke($r1); $b.EndInvoke($r2); $c.EndInvoke($r3) # wait
($a,$b,$c).streams.error # check for errors
($a,$b,$c).dispose() # clean

a done
b done
c done
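
If you want something closer to the "-threads 4" idea from the question, the same API can be combined with a runspace pool that caps concurrency. This is only a sketch under assumptions: the work items (`1..8`) and the doubling script block are placeholders, not anything from the question.

```powershell
# Placeholder work items; in practice this could be the output of Get-ChildItem.
$items = 1..8

# A pool that never runs more than 4 runspaces at the same time.
$pool = [runspacefactory]::CreateRunspacePool(1, 4)
$pool.Open()

# Queue one PowerShell instance per item; BeginInvoke returns immediately.
$tasks = foreach ($item in $items) {
    $ps = [powershell]::Create()
    $ps.RunspacePool = $pool
    [void]$ps.AddScript({ param($n) $n * 2 }).AddArgument($item)
    [pscustomobject]@{ Shell = $ps; Handle = $ps.BeginInvoke() }
}

# Collect results in submission order; EndInvoke blocks until that item is done.
$results = foreach ($task in $tasks) {
    $task.Shell.EndInvoke($task.Handle)
    $task.Shell.Dispose()
}

$pool.Close()
$pool.Dispose()

$results
```

The pool queues the extra items until a runspace frees up, so the second argument to CreateRunspacePool plays the role of the thread count.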
js2010
  • Might be worth noting that workflows are not supported since PowerShell 6, i.e. since PowerShell switched to .NET Core. They are built on top of Windows Workflow Foundation, which is not available in .NET Core. There is the "PS 5 only" remark, but it is easy to overlook. – Palec May 27 '21 at 20:26
  • Start-Process typically creates a new window. With -NoNewWindow, the process gets direct access to the current console. That is unlike jobs and ForEach-Object -Parallel, which somehow process the output. – Palec May 27 '21 at 20:57
12

In PowerShell 7 you can use ForEach-Object -Parallel:

$Message = "Output:"
Get-ChildItem $dir | ForEach-Object -Parallel {
    "$using:Message $_"
} -ThrottleLimit 4
izharsa
8

http://gallery.technet.microsoft.com/scriptcenter/Invoke-Async-Allows-you-to-83b0c9f0

I created Invoke-Async, which allows you to run multiple script blocks/cmdlets/functions at the same time. This is great for small jobs (a subnet scan or a WMI query against hundreds of machines) because the overhead of creating a runspace is drastically lower than the startup time of Start-Job. It can be used like so.

with scriptblock,

$sb = [scriptblock] {param($system) gwmi win32_operatingsystem -ComputerName $system | select csname,caption} 

$servers = Get-Content servers.txt 

$rtn = Invoke-Async -Set $servers -SetParam system -ScriptBlock $sb

just cmdlet/function

$servers = Get-Content servers.txt 

$rtn = Invoke-Async -Set $servers -SetParam computername -Params @{count=1} -Cmdlet Test-Connection -ThreadCount 50
jrich523
7

Background jobs are expensive to set up and are not reusable. PowerShell MVP Oisin Grehan has a good example of PowerShell multi-threading.

(10/25/2010 site is down, but accessible via the Web Archive).

I've adapted Oisin's script for use in a data loading routine here:

http://rsdd.codeplex.com/SourceControl/changeset/view/a6cd657ea2be#Invoke-RSDDThreaded.ps1

jpaugh
Chad Miller
6

To complete previous answers, you can also use Wait-Job to wait for all jobs to complete:

For ($i=1; $i -le 3; $i++) {
    $ScriptBlock = {
        Param (
            [string] [Parameter(Mandatory=$true)] $increment
        )

        Write-Host $increment
    }

    Start-Job $ScriptBlock -ArgumentList $i
}

Get-Job | Wait-Job | Receive-Job
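
A variation on the same idea, as a sketch: on PowerShell 3.0 and later, `Receive-Job -Wait -AutoRemoveJob` waits, collects the output, and removes the finished jobs in one step. The job bodies here are placeholders, and keeping the job objects in `$jobs` (rather than using `Get-Job`) is my own choice so that unrelated jobs in the session are left alone.

```powershell
# Start three placeholder jobs and keep references to them.
$jobs = 1..3 | ForEach-Object {
    Start-Job -ScriptBlock { param($n) "job $n done" } -ArgumentList $_
}

# Block until every job has finished, collect the output, and remove the jobs.
$output = $jobs | Receive-Job -Wait -AutoRemoveJob

$output
```

The order of the lines in `$output` is not guaranteed to match the order in which the jobs were started.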
Thomas
5

If you're using the latest cross-platform PowerShell (which you should, btw) https://github.com/powershell/powershell#get-powershell, you can add a single & to run scripts in parallel: a trailing & starts the preceding command as a background job. (Use ; to run sequentially.)

In my case I needed to run 2 npm scripts in parallel: npm run hotReload & npm run dev


You can also set up npm to use PowerShell for its scripts (by default it uses cmd on Windows).

Run from project root folder: npm config set script-shell pwsh --userconfig ./.npmrc and then use single npm script command: npm run start

"start":"npm run hotReload & npm run dev"
GorvGoyl
2

This has been answered thoroughly. I just want to post this method I have created, based on PowerShell jobs, as a reference.

Jobs are passed in as a list of script blocks. They can be parameterized. The output of the jobs is color-coded and prefixed with a job index (just like in a VS build process, as this will be used in a build). It can be used to start up multiple servers at a time, to run build steps in parallel, and so on.

function Start-Parallel {
    param(
        [ScriptBlock[]]
        [Parameter(Position = 0)]
        $ScriptBlock,

        [Object[]]
        [Alias("arguments")]
        $parameters
    )

    $jobs = $ScriptBlock | ForEach-Object { Start-Job -ScriptBlock $_ -ArgumentList $parameters }
    $colors = "Blue", "Red", "Cyan", "Green", "Magenta"
    $colorCount = $colors.Length

    try {
        while (($jobs | Where-Object { $_.State -ieq "running" } | Measure-Object).Count -gt 0) {
            $jobs | ForEach-Object { $i = 1 } {
                $fgColor = $colors[($i - 1) % $colorCount]
                $out = $_ | Receive-Job
                $out = $out -split [System.Environment]::NewLine
                $out | ForEach-Object {
                    Write-Host "$i> " -NoNewline -ForegroundColor $fgColor
                    Write-Host $_
                }
                
                $i++
            }
        }
    } finally {
        Write-Host "Stopping Parallel Jobs ..." -NoNewline
        $jobs | Stop-Job
        $jobs | Remove-Job -Force
        Write-Host " done."
    }
}

sample output: [image: sample output]

Dharman
Chris
  • How come the 7th line shows a blue `i` symbol. Doesn't the script lose all color information of the output of underlying job scripts? – Monsignor Oct 22 '20 at 11:54
  • when i remember correctly i didnt touch the script after generating the output. so i assume the color is not stripped. unfortunately pwsh is not very consistent when it comes to console colors, thus i am not sure at all – Chris Oct 22 '20 at 14:14
2

There is a new built-in solution in PowerShell 7.0 Preview 3: the PowerShell ForEach-Object Parallel Feature.

So you could do:

Get-ChildItem $dir | ForEach-Object -Parallel {

    # .. Do Work
    $_ # this will be your file

} -ThrottleLimit 4
Olaf