1

When running processes in a PowerShell pipeline, the next process will only start after the previous one exits. Here's a simple command to demonstrate:

python -c "from time import *; print(time()); sleep(3)" | python -c "from time import *; print(input()); print(time())"

Which will print something like:

1599497759.5275168
1599497762.5317411

(Note that the timestamps are 3 seconds apart.)

Is there any way to make the processes run in parallel? I'm looking for a solution that works on either Windows PowerShell or PowerShell Core on Windows.

I found this question, but it only deals with cmdlets, not normal executables.

jfhr

3 Answers

2

This is likely PowerShell's native command processor waiting to see if any more output is written before binding it to the downstream command.

Explicitly flushing the output seems to work (tested with Python 3.8 and PowerShell 7.0.1 on Ubuntu 20.04):

python3 -c "from time import *; print(time(), flush=True); sleep(3)" | python3 -c "from time import *; print(input()); print(time())"

This gives me timestamps within 2 ms of each other.
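The Python side of this effect can be reproduced without PowerShell. The following is a minimal sketch, not a description of PowerShell's internals: it assumes the delay comes from Python block-buffering its stdout when that stdout is a pipe rather than a terminal. The helper name `first_line_delay` and its `pause` parameter are made up for illustration.

```python
import os
import subprocess
import sys
import time


def first_line_delay(flush: bool, pause: float = 1.0) -> float:
    """Return the seconds between starting a child and reading its first line."""
    # The child prints one line, then sleeps before exiting.
    child_code = (
        "import time; "
        f"print('ready', flush={flush}); "
        f"time.sleep({pause})"
    )
    # Drop PYTHONUNBUFFERED so the child uses its default buffering (assumption:
    # nothing else in the environment forces unbuffered output).
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}

    start = time.monotonic()
    with subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdout=subprocess.PIPE,
        text=True,
        env=env,
    ) as child:
        child.stdout.readline()  # blocks until the first line reaches the pipe
        delay = time.monotonic() - start
    return delay


if __name__ == "__main__":
    print("without flush:", round(first_line_delay(flush=False), 2), "s")
    print("with flush:   ", round(first_line_delay(flush=True), 2), "s")
```

Without the flush, the line should only reach the reader when the child exits and its buffer is written out, so the measured delay includes the sleep. With `flush=True` the line is available as soon as it is printed.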

On Windows, the flush=True option doesn't seem to alleviate the problem, but wrapping the second command in ForEach-Object does:

python3 -c "from time import *; print(time(), flush=True); sleep(3)" | ForEach-Object { $_ | python3 -c "from time import *; print(input()); print(time())" }
Mathias R. Jessen
  • Thanks for the answer, but it doesn't seem to work on Windows (should have specified that's what I'm looking for). Tested on Windows PowerShell 5.1.19041.1 and PowerShell Core 7.0.3 – jfhr Sep 07 '20 at 19:08
  • 1
    @jfhr I managed to get it to work in Windows with `ForEach-Object` as an intermediary, updated the answer – Mathias R. Jessen Sep 07 '20 at 19:51
  • It worked for me with Python 2 and PowerShell 7: `python -c "from time import *; import sys; print(time()); sys.stdout.flush(); sleep(3)" | python -c "from time import *; print(input()); print(time())"` – js2010 Sep 07 '20 at 22:27
0

Using the pipeline in PowerShell is about passing a stream of objects from one synchronous command to another, not asynchronous processing.

If you just want concurrent processing, you could use jobs.

You haven't indicated which version of PowerShell you're using so I'm going to assume just Windows PowerShell.

# Create your script block of code you want to execute in each thread/job
$ScriptBlock = {
    python -c "from time import *; print(time()); sleep(3)"
}

# A trivial example demonstrating that we can create 5 jobs to execute the same script block
$Jobs = 1..5 | ForEach-Object {
    Start-Job -ScriptBlock $ScriptBlock
}

# Wait for jobs to finish executing
Wait-Job $Jobs

# Receive the output data stream from all jobs
# (Get-Job has no parameter that accepts job objects, so pass them to Receive-Job directly)
Receive-Job -Job $Jobs

Read more about jobs in the about_Jobs help topic:

Get-Help about_Jobs

https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_jobs?view=powershell-5.1
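Jobs run commands concurrently but do not connect one command's output to another's input. If the two processes also need to be piped together, one option is to start both at once and join them with an OS-level pipe from a small launcher script. The following is a minimal sketch in Python, assuming a launcher script is acceptable in place of the PowerShell pipeline; the names `PRODUCER`, `CONSUMER`, and `run_piped` are illustrative only, and the sleep is shortened to 2 seconds.

```python
import subprocess
import sys

# Same idea as the commands in the question: the producer prints a timestamp
# and keeps running; the consumer echoes that line and prints its own timestamp.
PRODUCER = "import time; print(time.time(), flush=True); time.sleep(2)"
CONSUMER = "import time; print(input()); print(time.time())"


def run_piped() -> tuple[float, float]:
    """Run both processes concurrently and return (sent, received) timestamps."""
    producer = subprocess.Popen(
        [sys.executable, "-c", PRODUCER],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c", CONSUMER],
        stdin=producer.stdout,  # connect the read end of the pipe directly
        stdout=subprocess.PIPE,
        text=True,
    )
    # The consumer holds its own handle to the pipe, so the parent can drop its copy.
    producer.stdout.close()

    output, _ = consumer.communicate()
    producer.wait()

    sent, received = (float(line) for line in output.splitlines())
    return sent, received


if __name__ == "__main__":
    sent, received = run_piped()
    print(f"consumer got the line {received - sent:.3f} s after it was printed")
```

Because both processes are started before either is waited on, the consumer can read the producer's line while the producer is still sleeping.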

codaamok
0

If you want or need to use the pipeline, I would suggest using the following (available in PowerShell 7 and later):

ForEach-Object -Parallel

PowerShell ForEach-Object Parallel Feature

Alternatively, you could use a workflow (Windows PowerShell only; workflows were removed in PowerShell 6 and later):

workflow paralleltest {
    parallel {
        python -c "from time import *; print(time()); sleep(3)" | python -c "from time import *; print(input()); print(time())"
    }
}

PowerShell Workflows

Adis1102