
I'm trying to store the stdout and stderr outputs of a command to two separate files. I'm doing this like so:

powershell.exe @_cmd 2>"stderr.txt" >"stdout.txt"

Where $_cmd is an arbitrary string command.

This works, but the output files have newlines appended after the output. I'd like to modify this to eliminate the newlines. I know you can use cmd | Out-File ... -NoNewline or [System.IO.File]::WriteAllText(..., [System.Text.Encoding]::ASCII), but I'm not sure how to accomplish this with the stderr output.

EDIT: I've realized that the issue isn't the trailing new line specifically (although I still want to remove it), but the fact that I need the output file to be UTF-8 encoded. The trailing new line is not a valid UTF-8 character apparently, which is what's causing me grief. Perhaps there's a way to capture the stderr and stdout to separate variables, and then use Out-File -Encoding utf8?

mklement0
Jordan
  • You should look into `Start-Process` and the `-RedirectStandardError` and `-RedirectStandardOutput` parameters. – TheMadTechnician Jun 16 '21 at 21:48
  • You only want a _trailing_ newline removed from the resulting files? – mklement0 Jun 16 '21 at 21:54
  • @TheMadTechnician It looks like `Start-Process` only takes a file input, not a raw command string? I need it to take a string. @mklement0 Yes, just the trailing newline. – Jordan Jun 16 '21 at 22:07
  • @mklement0 to clarify, I need it to be a UTF-8 encoded file. It seems that the final newline that it appends is not a valid UTF-8 character. – Jordan Jun 16 '21 at 22:08
  • This might be a dumb question, but what is the reason for executing the commands with `powershell.exe -c` rather than calling `@_cmd` in a PS script with a `try` / `catch` block? – Santiago Squarzon Jun 16 '21 at 22:14
  • In _Windows PowerShell_, `>` produces "Unicode" (UTF-16LE) files by default, whereas PowerShell (Core) 7+ uses BOM-less UTF-8 consistently. In Windows PowerShell 5.1 (and also in PowerShell (Core)) you can change the default encoding via `$PSDefaultParameterValues`, as explained in [this answer](https://stackoverflow.com/a/51847431/45375). However, note that `>` in PowerShell never passes raw byte output through; the output is always decoded into .NET strings first (based on `[Console]::OutputEncoding`) and then re-encoded on output to a file. – mklement0 Jun 16 '21 at 22:18
  • Good point, @SantiagoSquarzon; if this is being called _from PowerShell_, there is no need to use the (Windows) PowerShell CLI, `powershell.exe`, and `& $_cmd[0] $_cmd[1..($_cmd.Count-1)] 2>"stderr.txt" >"stdout.txt"` should do. – mklement0 Jun 16 '21 at 22:34
  • @mklement0 yes, the powershell.exe is unnecessary. Changing it to the suggested format doesn't solve the core UTF-8 encoding issue though. – Jordan Jun 17 '21 at 03:28
  • The solution to that is in my previous comment and its linked answer. @TheMadTechnician's [`Start-Process`](https://learn.microsoft.com/powershell/module/microsoft.powershell.management/start-process) approach is also an option, especially if you want _BOM-less_ UTF-8 files in Windows PowerShell. – mklement0 Jun 17 '21 at 04:02
  • P.S. Either way you'll have to trim a trailing newline from the files' content after the fact. – mklement0 Jun 17 '21 at 04:10
  • @mklement0 What's the tangible difference between `@_cmd` and `$_cmd[0] $_cmd[1..($_cmd.Count-1)]`? I'm not overly familiar with splatting. – Jordan Jun 19 '21 at 01:18
  • @Jordan, `@_cmd` is for splatting _arguments_ being passed to a _command_ (executable), but splatting doesn't support including the command in the array, which must always be specified separately, explicitly, such as `$_cmd[0]` in this case, which - as a variable-based command - in turn requires `&` for invocation. You don't strictly need splatting for calls to _external programs_, where a regular array - such as `$_cmd[1..($_cmd.Count-1)]` - will do, and, conversely, for actual splatting you may only use a (`@`-prefixed) _variable name_, not an _expression_ – mklement0 Jun 19 '21 at 01:40

2 Answers


Your own Start-Process-based solution that uses -RedirectStandardOutput and -RedirectStandardError indeed creates (BOM-less) UTF-8-encoded output files, but note that they too invariably have a trailing newline.

However, you do not need Start-Process, because you can make PowerShell's redirection operator, >, produce UTF-8 files too (also with a trailing newline).

The following examples use a sample cmd.exe call that produces both stdout and stderr output.

  • In PowerShell (Core) v6+, no extra effort is needed, because > produces (BOM-less) UTF-8 files by default (a default that is used consistently; if you want UTF-8 with a BOM, you can use the technique detailed for Windows PowerShell below, but with value 'utf8bom'):

    cmd /c 'echo hü & dir c:\nosuch' 2>stderr.txt >stdout.txt
    
  • In Windows PowerShell, > produces UTF-16LE ("Unicode") by default, but in version 5.1 you can (temporarily) reconfigure it to use UTF-8 instead, albeit invariably with a BOM; see this answer for details. Another caveat is that the first stderr line captured in the file will be formatted "noisily", like a PowerShell error:

    # Windows PowerShell v5.1:
    # Make `>` and its effective alias, Out-File, use UTF-8 with a BOM in the
    # remainder of the session.
    # Save and restore any previous value if you want to scope the behavior
    # to select commands only.
    $PSDefaultParameterValues['Out-File:Encoding'] = 'utf8'
    
    cmd /c 'echo hü & dir c:\nosuch' 2>stderr.txt >stdout.txt
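
The "save and restore" advice in the comments above can be sketched as follows. This is only a minimal sketch: `Invoke-WithOutFileEncoding` is a hypothetical helper name, not a built-in command, and it assumes it is defined in a script or session (not a module), so that it sees the session's `$PSDefaultParameterValues`.

```powershell
# Hypothetical helper (not a built-in): run a script block with a temporary
# 'Out-File:Encoding' default, then restore whatever was configured before.
function Invoke-WithOutFileEncoding {
    param(
        [Parameter(Mandatory)] [string] $Encoding,
        [Parameter(Mandatory)] [scriptblock] $ScriptBlock
    )
    $key = 'Out-File:Encoding'
    # Remember whether a value existed at all, so it can be removed again if not.
    $hadPrevious = $PSDefaultParameterValues.ContainsKey($key)
    $previous = $PSDefaultParameterValues[$key]
    try {
        $PSDefaultParameterValues[$key] = $Encoding
        & $ScriptBlock
    }
    finally {
        if ($hadPrevious) { $PSDefaultParameterValues[$key] = $previous }
        else { $PSDefaultParameterValues.Remove($key) }
    }
}
```

With the sample command from above, a call might look like `Invoke-WithOutFileEncoding utf8 { cmd /c 'echo hü & dir c:\nosuch' 2>stderr.txt >stdout.txt }`.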
    

Caveat:

  • Whenever PowerShell processes an external program's output, it invariably decodes it into .NET strings first. Any external program is assumed to produce output based on the character encoding stored in [Console]::OutputEncoding, which defaults to the system's active OEM code page. This works as expected with cmd.exe, but there are other console applications that use different encodings - notably node.exe (Node.js) and python, which use UTF-8 and the system's active ANSI code page, respectively - in which case [Console]::OutputEncoding must be set to that encoding first; see this answer for more information.
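
To illustrate the caveat, here is a minimal sketch of temporarily switching `[Console]::OutputEncoding` around a call to a UTF-8-emitting program. `Invoke-WithOutputEncoding` is a hypothetical helper name introduced for illustration, and the sketch assumes a console is attached to the session.

```powershell
# Hypothetical helper (not a built-in): temporarily tell PowerShell which
# encoding to use when decoding an external program's output.
function Invoke-WithOutputEncoding {
    param(
        [Parameter(Mandatory)] [System.Text.Encoding] $Encoding,
        [Parameter(Mandatory)] [scriptblock] $ScriptBlock
    )
    $previous = [Console]::OutputEncoding
    try {
        [Console]::OutputEncoding = $Encoding
        & $ScriptBlock
    }
    finally {
        # Always restore the original encoding, even if the call fails.
        [Console]::OutputEncoding = $previous
    }
}
```

Assuming `node` is on the PATH, usage could look like `$lines = Invoke-WithOutputEncoding ([System.Text.UTF8Encoding]::new($false)) { node -e "console.log('hü')" }`.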

As for your statements and questions:

The trailing new line is not a valid UTF-8 character apparently

PowerShell's > operator and file-output cmdlets apply their character encoding consistently, so the trailing newline's encoding is always consistent with that of the other characters in the file.

Most likely, the true problem was the UTF-16LE ("Unicode") encoding that Windows PowerShell uses by default, and you may have noticed it only with respect to the newline.
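
One way to check which encoding a file actually ended up with is to look at its first bytes. The following is a rough sketch only: `Get-FileBomName` is a hypothetical helper, it recognizes just three common byte-order marks, and the absence of a BOM does not by itself prove that a file is UTF-8.

```powershell
# Hypothetical helper (not a built-in): report the byte-order mark of a file.
function Get-FileBomName {
    param([Parameter(Mandatory)] [string] $LiteralPath)
    # .NET methods do not share PowerShell's current location; use a full path.
    $fullPath = Convert-Path -LiteralPath $LiteralPath
    # Reads the whole file for simplicity; fine for small output files.
    $bytes = [System.IO.File]::ReadAllBytes($fullPath)
    $hex = ($bytes | Select-Object -First 3 | ForEach-Object { $_.ToString('X2') }) -join ' '
    switch -Regex ($hex) {
        '^EF BB BF' { 'utf8-bom'; break }
        # Note: a UTF-32LE BOM also starts with FF FE and would be reported as utf16-le.
        '^FF FE'    { 'utf16-le'; break }
        '^FE FF'    { 'utf16-be'; break }
        default     { 'no-bom' }
    }
}
```

For a file created by > in Windows PowerShell with default settings, `Get-FileBomName stdout.txt` should report `utf16-le`.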

Perhaps there's a way to capture the stderr and stdout to separate variables

Stdout can be captured by a simple variable assignment, which captures multiple output lines as an array of strings:

$stdout = cmd /c 'echo hü & dir c:\nosuch'

You cannot capture stderr output directly in a variable of its own, but you can merge stderr into stdout with 2>&1 and later separate the streams' respective output lines again, based on their data types: stdout lines are always strings, whereas stderr lines are always [System.Management.Automation.ErrorRecord] instances:

# Note the 2>&1 redirection.
$stdoutAndErr = cmd /c 'echo hü & dir c:\nosuch' 2>&1

# If desired, you can split the captured output into stdout and stderr output.
# The [string[]] cast converts the [ErrorRecord] instances to strings too.
$stdout, [string[]] $stderr = $stdoutAndErr.Where({ $_ -is [string] }, 'Split')

# Now $stdout is the array of stdout lines, and $stderr the array of stderr lines.
# If desired, you could write them to files *without a trailing newline* as follows:
$stdout -join [Environment]::NewLine | Set-Content -NoNewLine -Encoding utf8 stdout.txt
$stderr -join [Environment]::NewLine | Set-Content -NoNewLine -Encoding utf8 stderr.txt

You can also apply these techniques to PowerShell-native commands (and you can even merge all other streams that PowerShell supports into the success output stream, PowerShell's analog to stdout, with *>&1).

However, if a given PowerShell-native command is a cmdlet / advanced script or function, the more convenient alternative is to use the common -OutVariable parameter (for success-stream output) and common -ErrorVariable parameter (for error-stream output).
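
A minimal sketch of the -OutVariable / -ErrorVariable approach, using Get-Item purely as a stand-in cmdlet. The second path is a deliberately made-up, nonexistent name chosen to provoke a non-terminating error, and the variable names `out` and `err` are arbitrary.

```powershell
# $PSHOME exists; the second path is a made-up name that should not exist.
$paths = $PSHOME, (Join-Path $PSHOME 'no-such-item-for-demo')

# Variable names are passed WITHOUT the $ sigil.
# -ErrorAction SilentlyContinue only suppresses the error display;
# the errors are still collected in the -ErrorVariable variable.
$null = Get-Item -LiteralPath $paths -OutVariable out -ErrorVariable err -ErrorAction SilentlyContinue

# $out now holds the success-stream objects, $err the ErrorRecord instances.
# Convert both to single strings without a trailing newline.
$stdoutText = $out.FullName -join [Environment]::NewLine
$stderrText = ($err | ForEach-Object { $_.ToString() }) -join [Environment]::NewLine
```

The resulting strings could then be written with `Set-Content -NoNewLine -Encoding utf8`, as in the external-program example above.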

mklement0

@TheMadTechnician's comment held the answer that worked.

$process = Start-Process powershell.exe -ArgumentList "$_cmd" -Wait -PassThru -NoNewWindow -RedirectStandardError "stderr.txt" -RedirectStandardOutput "stdout.txt"
$exitcode = $process.ExitCode
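
Note that, as pointed out in the comments and in the other answer, the files written this way still end in a newline. A minimal sketch of trimming it after the fact follows; `Remove-TrailingNewline` is a hypothetical helper, and it assumes the files are UTF-8 or carry a BOM.

```powershell
# Hypothetical helper (not a built-in): strip ONE trailing newline (CRLF or LF)
# and rewrite the file as BOM-less UTF-8.
function Remove-TrailingNewline {
    param([Parameter(Mandatory)] [string] $LiteralPath)
    # .NET methods do not share PowerShell's current location; use a full path.
    $fullPath = Convert-Path -LiteralPath $LiteralPath
    # ReadAllText honors a BOM if present and otherwise assumes UTF-8.
    $text = [System.IO.File]::ReadAllText($fullPath)
    # \z anchors at the very end of the string, so only the final newline goes.
    $trimmed = $text -replace '\r?\n\z'
    [System.IO.File]::WriteAllText($fullPath, $trimmed, [System.Text.UTF8Encoding]::new($false))
}
```

After the Start-Process call, this could be applied as `'stdout.txt', 'stderr.txt' | ForEach-Object { Remove-TrailingNewline $_ }`.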
Jordan