1

I'm writing a powershell function to process lines from a log file. The log file is being written to by another process. In order to make viewing the logs easier I'm parsing each log line into an object and passing that along the pipeline. I've added a -Wait parameter which will continue watching for changes instead of exiting.

Note: This is a trimmed down version with hard-coded fields but it shows the same behavior

function Watch-Log($Path, [switch]$Wait) {

    [System.IO.FileStream]$log = $null
    [System.IO.StreamReader]$logReader = $null

    $logPath = Split-Path -Parent $Path
    $logFilter = Split-Path -Leaf $Path
    $watcher = [System.IO.FileSystemWatcher]::new($logPath, $logFilter)
    $watcher.IncludeSubdirectories = $false
    $createdEventIdentifier = "Service_LogFileCreated"
    Register-ObjectEvent $watcher Created -SourceIdentifier $createdEventIdentifier

    $watcher.EnableRaisingEvents = $true

    function Open-ServiceLogStream([ref][System.IO.FileStream]$log, [ref][System.IO.StreamReader]$logReader) {
        if ($log.Value) { $log.Value.Close() }

        $log.Value = [System.IO.File]::Open($Path, "Open", "Read", "ReadWrite,Delete");
        $logReader.Value = [System.IO.StreamReader]::new($log.Value)
    }

    try {
        Open-ServiceLogStream ([ref]$log) ([ref]$logReader)

        while ($true) {
            $line = $logReader.ReadLine()
            if (!$line) {

                if (!$Wait) { return }

                $newLogEvent = Get-Event -SourceIdentifier $createdEventIdentifier -ErrorAction SilentlyContinue

                if ($newLogEvent) {
                    Remove-Event -EventIdentifier $newLogEvent.EventIdentifier
                    Open-ServiceLogStream ([ref]$log) ([ref]$logReader)
                }
                else {
                    Start-Sleep -Milliseconds 100
                }
                continue;
            }

            $lineParts = $line -split " "
            [PSCustomObject]@{ PSTypeName = "Test.LogLine"; Date = $lineParts[0]; Priority = $lineParts[1]; LogMessage = $lineParts[2]; }
        }
    }
    finally {
        Unregister-Event -SourceIdentifier $createdEventIdentifier
    }
}

So it might be used like this:

> Watch-Log test.log | Select Date, Priority, LogMessage

Date               Priority LogMessage
----               -------- ----------
7/10/2023 19:18:50 Message  New Log Opened
7/10/2023 19:19:33 Message  Line Here!
7/10/2023 19:19:33 Message  Line Here!
...

Unfortuantely, if I'm using the -Wait parameter the "output" of the pipeline is buffered forever and never gets printed. Ideally, I would like it to start printing the output in the table, and add new rows as logs are emitted. I'm okay if the column widths stay fixed based on the first log line it sees.

I can trick it into printing stuff immediately by using Out-Host

Watch-Log test.log -Wait | Out-Host

But it's frustrating to require other users of this to know they need to append Out-Host, and if it's wrapped in another function than you can't force that.

Update

I've finally uncovered the source of the problem after being unable to repro the issue on a machine without my default Powershell profile.

A few years ago I was doing something in Powershell and ended up overriding Out-Default in my powershell profile.

I've been dragging this profile around for quite some time (since the Windows Powershell only days I think?) and totally forgot about this.

function Out-Default {
  $input | Tee-Object -var global:last | Microsoft.PowerShell.Core\out-default
}

Since this isn't a pipeline function, it blocks as it gathers everything into $input before passing it onto Tee-Object which is causing the default formatting to break.

Yes, I've been trying to figure this out for days and FINALLY only moments after submitting a bug about it I uncovered the problem! Thanks everyone for helping get me pointed in the right direction.

Ben Randall
  • 1,205
  • 10
  • 27
  • 1
    You forgot to post the code for `Watch-Log`. What does `-Wait` do, exactly? – Mathias R. Jessen Jul 10 '23 at 19:35
  • 1
    remove `Format-Table`, up to 4 properties are displayed as table by default – Santiago Squarzon Jul 10 '23 at 19:36
  • The code for `Watch-Log` isn't important. It's just a function that returns an object every once in a while. It's the filtering and output that I'm interested in. The `-Wait` parameter keeps the log file open and continues to process log lines as they are added (as opposed to just processing log lines that are available and exiting). – Ben Randall Jul 10 '23 at 19:39
  • 1
    @BenRandall The table should automatically start rendering within 300ms after the first object arrives, or after the second object with the same shape arrives, whichever comes first. How long did you wait? – Mathias R. Jessen Jul 10 '23 at 19:41
  • @SantiagoSquarzon, the problem is the same without the Format-Table. With the `-Wait` parameter if I don't append `Write-Host` or something to the pipeline, it blocks until `Watch-Log` exits (which it never does because of `-Wait`) until it prints any output. – Ben Randall Jul 10 '23 at 19:43
  • @MathiasR.Jessen, I've been waiting dozens of seconds (a log line comes in around once every second) and I never see any output. – Ben Randall Jul 10 '23 at 19:44
  • 1
    I think what Mathias said in his comment is important, we need the code for your custom function to understand what is wrong there because `Get-Content -Wait | Select-Object foo, bar, baz` would work just fine streaming lines as they're written to the log – Santiago Squarzon Jul 10 '23 at 19:46
  • 1
    [Here's an example showing the behavior mentioned above](https://gist.github.com/IISResetMe/fde51d6d693a96d1e8673e2b566456b0). Now I've shown you mine, please show us yours :) – Mathias R. Jessen Jul 10 '23 at 19:55
  • 2
    If you're not comfortable sharing the implementation of `Watch-Log`: try writing [the smallest possible function that reproduces the behavior](https://stackoverflow.com/help/minimal-reproducible-example) you're seeing and then share that - it might even lead you to the answer on your own – Mathias R. Jessen Jul 10 '23 at 20:04
  • @MathiasR.Jessen I updated the question with an example Watch-Log that reproduces the behavior. I need to add `Out-Host` to get it to output like you suggested. Is this expected behavior? Do you have details on where the delay before stuff is rendered is describe? – Ben Randall Jul 10 '23 at 20:34
  • 3
    Thanks! Unfortunately I'm still entirely unable to reproduce the behavior your describe - after slightly more than 1 second, your sample code starts rendering the table with the first 2 rows in my console - then another row each second after that. This (expected) behavior is consistent between Windows PowerShell 5.1 and PowerShell 7.3. What version of PowerShell are you using? – Mathias R. Jessen Jul 10 '23 at 20:35
  • 1
    Try to specify column widths explicitly to prevent the initial delay. This works for me: `Watch-Log | Format-Table -Property @{ e='LogMessage'; width=50 }` – zett42 Jul 10 '23 at 20:45
  • @zett42 - No change with column widths – Ben Randall Jul 10 '23 at 20:57
  • @MathiasR.Jessen - Hmm... that is interesting. Even with your code I get the same behavior unless I append Out-Host. I'm currently using Powershell 7.3.4 and seeing this. I wonder if it's just a bug? – Ben Randall Jul 10 '23 at 20:58
  • @BenRandall, `Out-Host` shouldn't make a difference, and I'm not aware of a bug in v7.3.4 (note that 7.3.5 is now available). What environment (terminal application) are you running this in? – mklement0 Jul 10 '23 at 22:01
  • 1
    @MathiasR.Jessen, the 300-millisecond delay is a great pointer, but note that another object with the same properties arriving within that time window does _not_ short-circuit that delay; e.g. `& { [pscustomobject] @{ foo=1 }; [pscustomobject] @{ foo=2}; Start-Sleep 5 } | Format-Table` – mklement0 Jul 10 '23 at 22:03
  • Okay, I've updated the example with a more detailed specific implementation of Watch-Log that's closer to my implementation. – Ben Randall Jul 15 '23 at 05:57

1 Answers1

2

The likely culprit is the infamous 300-millisecond delay that Format-Table uses in order to collect enough input data so as to make a reasonable guess how wide the output columns should be, before it starts to produce display output.

  • See this answer for a detailed explanation.

  • The most common gripe about this behavior is that - when Format-Table is used implicitly - it can interfere with output ordering across PowerShell's different output streams - see this answer.


However, as the following example shows, the 300-millisecond delay can result in potentially much longer delays, and even potentially in never producing display output:[1]

# Note: Omitting Format-Table exhibits the same behavior, because
#       an object with 4 or fewer properties is *implicitly* table-formatted,
#       unless the default formatting is governed by predefined formatting
#       data associated with the object's type.
& { [pscustomobject] @{ foo = 1 }; Start-Sleep 10 } | Format-Table

This doesn't print the table until 10 seconds later, namely after the Start-Sleep call returns and the input command has terminated.

The problem also occurs with use of cmdlets that wait for an unspecified period for an event to occur, namely Wait-Job, Wait-Process and Wait-Event.


The problem is that Format-Table doesn't regain control until it either receives the next input object or the input command terminates.


Workarounds:

The 300-millisecond delay can be avoided if Format-Table knows the column widths ahead of time, which can be achieved in one of two ways:

  • Specify the column widths via calculated properties, as zett42 recommends:

    # Starts to produce display output *instantly*, 
    # due to the predefined column width.
    & { [pscustomobject] @{ foo = 1 }; Start-Sleep 10 } | 
      Format-Table -Property @{ Expression='foo'; Width = 10 }
    
    • Unfortunately, you'll have to do that for every output property (column).

    • Also, any explicit use of a Format-* cmdlet invariably interferes with using the output as data - see this answer.

  • Associate predefined formatting data with the .NET type of your output objects and define a table view with fixed column widths.

    • This is the preferable solution - not least because it doesn't interfere with outputting data - but requires the nontrivial effort of authoring a formatting file (*._Format.ps1xml), which must be loaded into the session first.

    • Additionally, you must make Write-Log either use a specific .NET type associated with your formatting data or add a PSTypeName property to [pscustomobject] output objects (e.g, [pscustomobject] @{ PSTypeName = 'My.Type'; foo = 1 })


[1] Strictly speaking, an initial, empty line always prints, to be followed by the actual table only after the 300-millisecond delay has elapsed.

mklement0
  • 382,024
  • 64
  • 607
  • 775
  • Okay, finally got a chance to try this out and it still doesn't seem to work. I tried both adding `Format-Table` with explicit widths, as well as a Format.ps1xml. I know the format file is loaded properly because loading without `-Wait` formats it as a table instead of a list (as expected), but with` -Wait` it still just hangs. I've added some debug lines to verify that it's still processing and outputting stuff even though it's blocked. – Ben Randall Jul 14 '23 at 21:27
  • @BenRandall, given that your own repro code in the question doesn't reproduce the symptom for others, there must be something unusual about your execution environment you're not mentioning. The solution in this question works in regular console windows, in Windows Terminal, in the (obsolescent) PowerShell ISE, and in Visual Studio Code's integrated terminal. It also works in macOS and Linux terminals (as verified via WSL). – mklement0 Jul 14 '23 at 21:36
  • @BenRandall, I'm also confused about "formats it as a table instead of a list": with _list_ formatting, the problem wouldn't even arise, because no delay is ever involved when list formatting is used; try `& { [pscustomobject] @{ foo = 1 }; Start-Sleep 10 } | Format-List` – mklement0 Jul 14 '23 at 21:42
  • Sorry, I meant that by default (without format data) it formats it as a list. But _with_ the format data, it formats it as a table by default without me needing to append `Format-Table` which proves the format data is being used. – Ben Randall Jul 14 '23 at 22:52
  • I agree it's weird that the behavior is different for others. Right now I'm mostly just trying to see if I can figure out why that's the case. – Ben Randall Jul 14 '23 at 22:53
  • @BenRandall, I've tried with your updated sample code, and everything seems to work as expected: using `... | Out-Host` makes _no_ difference (it shouldn't), but `... | Format-Table @{e='Date'; width=10}, @{e='Priority'; width=10}, @{e='LogMessage'; width=59}` solves the problem - note how _all_ output columns need an explicit width, and the same goes for defining formatting data. – mklement0 Jul 15 '23 at 13:17
  • 1
    I'm accepting this as the answer since I believe it's an accurate description of what's needed to solve the problem that I'm asking. I know that I've got some weird config on my machine that causing something to work improperly and I'm going to try and figure that out separately. – Ben Randall Jul 15 '23 at 22:28