
I use nuget-tree to inspect the NuGet dependencies in our code. It produces output like this:

C:\xyz\tip [master ≡]> $p |? { Test-Path "$_\..\packages.config" } | select -First 1 |% { pushd "$_\.."; "$_" ; nuget-tree.cmd --showSystem ; popd }
C:\xyz\tip\BI\a8i\a8i.csproj
packages.config
└── Newtonsoft.Json 11.0.2

C:\xyz\tip [master ≡]>


I would like to save the output to a file, but I can't figure out which encoding preserves the nice hierarchy ASCII art. Please observe:

C:\xyz\tip [master ≡]> $p |? { Test-Path "$_\..\packages.config" } | select -First 1 |% { pushd "$_\.."; "$_" ; nuget-tree.cmd --showSystem ; popd } | Out-File c:\temp\1.txt ; cat c:\temp\1.txt
C:\xyz\tip\BI\a8i\a8i.csproj
packages.config
ΓööΓöÇΓöÇ Newtonsoft.Json 11.0.2

C:\xyz\tip [master ≡]> $p |? { Test-Path "$_\..\packages.config" } | select -First 1 |% { pushd "$_\.."; "$_" ; nuget-tree.cmd --showSystem ; popd } | Out-File c:\temp\1.txt -Encoding ascii ; cat c:\temp\1.txt
C:\xyz\tip\BI\a8i\a8i.csproj
packages.config
????????? Newtonsoft.Json 11.0.2

C:\xyz\tip [master ≡]> $p |? { Test-Path "$_\..\packages.config" } | select -First 1 |% { pushd "$_\.."; "$_" ; nuget-tree.cmd --showSystem ; popd } | Out-File c:\temp\1.txt -Encoding unicode ; cat c:\temp\1.txt
C:\xyz\tip\BI\a8i\a8i.csproj
packages.config
ΓööΓöÇΓöÇ Newtonsoft.Json 11.0.2

C:\xyz\tip [master ≡]> $p |? { Test-Path "$_\..\packages.config" } | select -First 1 |% { pushd "$_\.."; "$_" ; nuget-tree.cmd --showSystem ; popd } | Out-File c:\temp\1.txt -Encoding utf8 ; cat c:\temp\1.txt
C:\xyz\tip\BI\a8i\a8i.csproj
packages.config
ΓööΓöÇΓöÇ Newtonsoft.Json 11.0.2

C:\xyz\tip [master ≡]>

How can this be done (other than by copying and pasting the console window content)?

mklement0
mark
  • try using `Add-Content` OR `Set-Content` ... i think your problem is the way that `Out-File` tags the output file. i _think_ the 1st two include a ByteOrderMark and the 3rd one does not ... but i may have that reversed. – Lee_Dailey Sep 05 '19 at 02:52
  • @Lee_Dailey: The problem happens _before_ the output is sent to a file: on Windows, the UTF-8-encoded Node.js output is misinterpreted when PowerShell reads it into strings _in memory_, due to instead assuming that the output is OEM code page-encoded, based on the default value of `cp` / `[console]::OutputEncoding`. – mklement0 Sep 05 '19 at 03:17

1 Answer


Node.js-based CLIs output UTF-8-encoded text, so you must set [Console]::OutputEncoding (temporarily) to that encoding to ensure that PowerShell reads the output correctly:

[Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()

Once PowerShell has read the output correctly, you can use > / Out-File to write the text to a file; by default this uses UTF-16LE ("Unicode") in Windows PowerShell and BOM-less UTF-8 in PowerShell Core. Alternatively, use Set-Content -Encoding Unicode or Set-Content -Encoding Utf8, though note that in Windows PowerShell the latter creates a file with a BOM.
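The steps so far can be combined into a minimal sketch that switches the console encoding only for the duration of the call and then restores it. This assumes PowerShell 5 or later (for the ::new() syntax); the helper name Invoke-WithUtf8Console is hypothetical and not part of PowerShell, and the commented usage simply reuses the command and file path from the question.

```powershell
# Hypothetical helper: runs a script block while [Console]::OutputEncoding
# is UTF-8, then restores the previous encoding even if the block throws.
function Invoke-WithUtf8Console {
    param(
        [Parameter(Mandatory)]
        [scriptblock] $ScriptBlock
    )

    # Remember the current setting so it can be put back afterwards.
    $previous = [Console]::OutputEncoding
    try {
        # Make PowerShell decode external-program output as UTF-8.
        [Console]::OutputEncoding = [System.Text.UTF8Encoding]::new()
        & $ScriptBlock
    }
    finally {
        [Console]::OutputEncoding = $previous
    }
}

# Usage with the command from the question (not run here):
# Invoke-WithUtf8Console { nuget-tree.cmd --showSystem } |
#     Set-Content -Path c:\temp\1.txt -Encoding Utf8
```

Restoring the previous value in a finally block keeps the change temporary, so other console programs started later in the same session see the original setting.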


Note that direct-to-console output, which involves no interpretation by PowerShell, may display the characters correctly irrespective of what [Console]::OutputEncoding is set to.
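To see why captured output gets garbled, the misinterpretation can be reproduced in memory without any external tool. This is only a sketch: it assumes the OEM code page is 437, which the question does not state (although the garbled output shown there is consistent with it), and that code page 437 is available to the PowerShell edition in use. The variable names are illustrative.

```powershell
# Build the tree prefix from code points (U+2514, then U+2500 twice),
# so the result does not depend on how this script file itself is encoded.
$tree = -join ([char[]](0x2514, 0x2500, 0x2500))

# These are the bytes a UTF-8-emitting CLI writes: E2 94 94 E2 94 80 E2 94 80
$utf8Bytes = [System.Text.Encoding]::UTF8.GetBytes($tree)

# Assumption: the console's OEM code page is 437.
$oem = [System.Text.Encoding]::GetEncoding(437)

# Decoding UTF-8 bytes as OEM text turns each 3-byte character into 3 wrong ones.
$garbled = $oem.GetString($utf8Bytes)

# Writing the already-garbled string as ASCII replaces every character with "?".
$asciiBytes = [System.Text.Encoding]::ASCII.GetBytes($garbled)
$asciiText  = [System.Text.Encoding]::ASCII.GetString($asciiBytes)

$garbled.Length    # 9, versus 3 characters in $tree
$asciiText         # ?????????
```

Because the damage is done at decoding time, no -Encoding value passed to Out-File afterwards can recover the original characters.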

For more information, see this answer.

mklement0
  • @mark: I see; ideally, you'd use an option that prevents colored output to begin with, but that doesn't seem to exist in this case; stripping the codes after the fact is probably nontrivial if it is to support all possible VT sequences; certainly worth a new question. – mklement0 Sep 05 '19 at 03:39