I am trying to pipe output logs to another script for processing. However, if I understand this issue correctly, PowerShell waits for the first command to finish before sending its output to the second command.

For example, this works as expected (script receives "hi"), since echo finishes quickly:

echo "hi" | python script.py

In this case, however, caddy's stdout never reaches the script's stdin (caddy is a web server, so it never finishes):

caddy run | python script.py

Is it possible to forward the output of the first command to the second asynchronously, without waiting for the first to finish?

--

Edit: minimal example

# output.py

from time import sleep

while True:
    print("SENDING")
    sleep(1)

# onetime.py

print("SENDING")

# input.py

import sys

# Iterating over sys.stdin ends at EOF, so no outer loop is needed.
for line in sys.stdin:
    print(f"Received: {line}", end="")

This works (prints to stdout Received: SENDING):

python onetime.py | python input.py

This doesn't work (nothing is printed):

python output.py | python input.py
Matija Sirk
  • You say `caddy run` "never sends output", so what exactly is it that you want "forwarded" to the python script? – Mathias R. Jessen Apr 20 '23 at 09:30
  • Maybe I was unclear - caddy sends output every few seconds (when request arrives). However script never receives it as input (pipe doesn't forward it). – Matija Sirk Apr 20 '23 at 09:30
  • As per @Mathias' comment, `caddy` isn't a native PowerShell command but an external program that isn't written according to the [Write Single Records to the Pipeline (SC03)](https://learn.microsoft.com/powershell/scripting/developer/cmdlet/strongly-encouraged-development-guidelines#write-single-records-to-the-pipeline-sc03) guideline of the PowerShell Strongly Encouraged Development Guidelines. – iRon Apr 20 '23 at 12:25
  • @iRon so is there any workaround for external commands, given that not all software can be modified in-house? Is there some way to wrap it in a better-behaved script? – Matija Sirk Apr 20 '23 at 12:33
  • @iRon, those guidelines don't apply to _external (native) programs_, which cannot be expected to abide by these rules. PowerShell treats the _lines_ of an external program's stdout output as the objects to send through the pipeline, and - due to a serious design limitation - this doesn't happen in a _streaming_ manner in _Windows PowerShell_ when the _consumer_ is an external program, something that has fortunately been fixed in PowerShell (Core) 7+ – mklement0 Apr 20 '23 at 12:51

1 Answer

Indeed:

  • In Windows PowerShell, when piping to an external program, the input command's output is unexpectedly:

    • collected in full in memory first

    • and the collected output is passed on via the pipeline only after the input command has exited.

  • Fortunately, this has been fixed in PowerShell (Core) 7+, where an input command's output is now streamed, line by line, as the lines become available, as usual. (This is also how it works in Windows PowerShell if the receiving command is a PowerShell command.)

A simple example:

  • Note that it doesn't matter whether the input-providing command is a PowerShell command or an external program; what matters is that the receiving command is an external program.

  • The example input command outputs 3 numbers, one at a time, and waits 1 second after each, and finally waits for the user to press Enter before exiting.
    The receiving command, findstr.exe ., simply echoes each number.

# In Windows PowerShell:
#   3 seconds elapse without anything getting printed, because PowerShell
#   is collecting the output first.
#   Then the user is prompted to press Enter.
#   Only if and when the user does so do the numbers print.
# In PowerShell 7+:
#   The numbers print as they are being emitted by the ForEach-Object call, and
#   only then is the user prompted to press Enter.
1..3 | ForEach-Object { $_; sleep 1 } -end { pause } | findstr.exe .
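
The same comparison can be run with two Python processes, which is closer to the setup in the question. The following is only a sketch: the file name probe.py and the functions send and receive are made up for illustration. It also assumes the sender flushes after every line, because Python block-buffers stdout when it is not attached to a terminal, and that buffering would delay output in any shell.

```python
# probe.py - hypothetical helper, not part of the original question or answer.
import sys
import time


def send(count, delay, out=sys.stdout):
    """Write `count` numbered lines, pausing `delay` seconds after each."""
    for i in range(1, count + 1):
        # flush=True pushes each line into the pipe immediately; without it
        # Python block-buffers stdout when it is not attached to a terminal.
        print(f"line {i}", file=out, flush=True)
        time.sleep(delay)


def receive(inp=sys.stdin, out=sys.stdout, clock=time.monotonic):
    """Echo each input line, prefixed with the seconds elapsed since startup."""
    start = clock()
    for line in inp:
        elapsed = clock() - start
        print(f"[{elapsed:5.1f}s] {line.rstrip()}", file=out, flush=True)


# Only act when a mode is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    if sys.argv[1] == "send":
        send(count=3, delay=1)
    elif sys.argv[1] == "receive":
        receive()
```

Run it as `python probe.py send | python probe.py receive`. Where the pipeline streams, the lines should show up about one second apart; where the output is collected first, they should all show up together once the sender has exited.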

Workaround:

Call via cmd.exe's CLI (cmd /c), because cmd.exe's pipeline behaves as expected (on Unix-like platforms, use sh -c analogously):

cmd /c 'caddy run | python script.py'

In fact:

  • cmd.exe's (/bin/sh's) pipeline is a raw byte conduit, unlike PowerShell's pipeline, which as of v7.3.4 "speaks only text"

  • Therefore, the above workaround is also needed in cases where sending raw byte data to / between external programs / to a file is necessary - see this answer.
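
If going through cmd.exe or sh is not an option, a small launcher script can connect the two programs with an OS-level pipe itself, so that PowerShell's pipeline is not involved at all. Below is a minimal sketch in Python using the standard subprocess module; the function name run_pipeline is hypothetical, and the sketch assumes both commands can be started directly, without shell features.

```python
import subprocess
import sys


def run_pipeline(producer_cmd, consumer_cmd):
    """Connect the producer's stdout to the consumer's stdin via an OS pipe.

    Bytes flow directly between the two processes as they are written;
    this script never reads or buffers them itself.
    """
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(consumer_cmd, stdin=producer.stdout)
    # Drop our handle on the pipe's read end so that the producer gets a
    # broken-pipe error if the consumer exits early.
    producer.stdout.close()
    consumer_rc = consumer.wait()
    producer_rc = producer.wait()
    return producer_rc, consumer_rc
```

For the case in the question, the call would look like `run_pipeline(["caddy", "run"], [sys.executable, "script.py"])`. Note that the producer may still buffer its own output when writing to a pipe; that is up to the producer and independent of how the pipe is set up.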

mklement0