I have read several Q&As about this problem, but none of them provided an answer. There is a workaround, which I repeat below, but I want to understand and solve the problem.

Problem

The issue is that executing the command `git diff reva revb | Out-File mypatch.patch` in PowerShell produces "garbage characters" in place of, for example, German umlauts (`├ñ` instead of `ä`).

Investigation

When I set `$Env:LESSCHARSET="utf8"`, as suggested in some Q&As, I do get correct output in the terminal, but once it is redirected to the file `mypatch.patch`, the umlauts (and other characters) are mangled. Even `git --no-pager diff reva revb` produces correct output in the terminal. But as soon as the output is piped to a file, it is wrong. What you see is not what you get!

It seems to me that the input to `Out-File` is already mangled, so setting the `-Encoding` argument does not change anything. I don't think `Out-File` is to blame here. For instance, the command `$mypatch = git diff reva revb` (even with `--no-pager` added before `diff`) results in a variable in which, for example, the Euro symbol or umlauts appear mangled (`Ôé¼` instead of `€`) when that variable is printed to the terminal.
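The two mangled sequences quoted above are consistent with one specific mix-up: git emits UTF-8 bytes, and something between git and the variable decodes them with an OEM code page. A minimal Python sketch reproduces both examples. Python is used here only because it makes byte-level experiments easy; code page 850 is an assumption inferred from the symptoms, and `mangle` is a hypothetical helper, not part of git or PowerShell.

```python
def mangle(text, wrong_codepage="cp850"):
    """Encode text as UTF-8, then decode the bytes with the wrong code page."""
    utf8_bytes = text.encode("utf-8")        # what git is assumed to write
    return utf8_bytes.decode(wrong_codepage)  # what a cp850 reader would see


print(mangle("ä"))  # ├ñ
print(mangle("€"))  # Ôé¼
```

If this is what happens, the text is already wrong before it reaches `Out-File`, which fits the observation that the variable itself contains the mangled characters.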

I tried Windows PowerShell 5.1 and the open-source PowerShell Core 6.0.4 on Windows 10 (1709). I use git 2.18.0.windows.1. It works fine with the Windows command line (cmd), so the simple workaround is to call cmd from the PowerShell console:

Workaround

cmd /c "git diff reva revb > mypatch.patch"

Question

How can I make this work with PowerShell alone?

Andreas
  • Did you try avoiding the pipeline: `Out-File -InputObject (git diff reva revb) -Path mypatch.patch -Encoding utf8`? – DarkLite1 Sep 06 '18 at 13:54
  • Visually that might be avoiding the pipeline but it is still going to be used in the background. I suspect the result will be the same. – Matt Sep 06 '18 at 14:05
  • You could do `git diff reva revb | Out-File -Encoding "UTF8" mypatch.patch`, but that will produce a file with a **BOM** (Byte-Order-Mark). If that is unwanted, use `$Utf8NoBom = New-Object System.Text.UTF8Encoding $False; [System.IO.File]::WriteAllLines($MyPath, $MyFile, $Utf8NoBom)` – Theo Sep 06 '18 at 15:22
  • @DarkLite1 This does not solve the problem. – Andreas Sep 07 '18 at 08:19

1 Answer

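The problem seems to be caused by a wrong setting of `[Console]::OutputEncoding`. PowerShell uses this encoding to decode the output of external programs such as git when that output is captured in a variable or sent down the pipeline. If it is not set to UTF-8, try setting it: `[Console]::OutputEncoding = [System.Text.Encoding]::UTF8`.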

It does not matter whether you then also set `$Env:LESSCHARSET`; or rather, I believe it is not used anymore.
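As a rough model of why this setting matters while `-Encoding` on `Out-File` does not, the redirection can be thought of as two steps: the shell first decodes the program's bytes into text, then the file writer encodes that text again. The Python sketch below illustrates that assumption and is not PowerShell's actual implementation; `redirect_to_file` and the code page `cp850` are hypothetical stand-ins.

```python
def redirect_to_file(raw_bytes, console_encoding, file_encoding="utf-8"):
    """Model of `program | Out-File`: decode the program's bytes, then re-encode."""
    text = raw_bytes.decode(console_encoding)  # step 1: shell decodes native output
    return text.encode(file_encoding)          # step 2: file writer encodes the text


git_output = "ä €".encode("utf-8")  # stand-in for the bytes git writes

broken = redirect_to_file(git_output, console_encoding="cp850")
fixed = redirect_to_file(git_output, console_encoding="utf-8")

print(broken.decode("utf-8"))  # ├ñ Ôé¼
print(fixed == git_output)     # True
```

In this model, changing `file_encoding` in the broken case only changes how the already mangled text is stored. Only the decoding step, the counterpart of `[Console]::OutputEncoding`, decides whether the text is right.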

Andreas
  • Au contraire, on my machine (tm), the `$Env:LESSCHARSET` fix alone is sufficient to make `git diff`'s output render correctly. The OutputEncoding setting does not hurt, but does not fix the symptom of garbled special characters (e.g. `` instead of `ü`). – ojdo Dec 20 '19 at 13:46
  • @ojdo As I stated, it's rendered correctly in the terminal, but if you redirect the output to a file, it's garbage, and only the setting in that answer fixed the problem for me. – Andreas Dec 21 '19 at 18:38