
Depending on how I clear a file in PowerShell, I get garbage.

echo "" > ".\file.txt"
OR
clear-content ".\file.txt"
THEN
[io.file]::AppendAllText("file.txt", "teststring")

Using clear-content, I get "teststring" in my "file.txt".

Using echo "" > ".\file.txt", I get garbage Chinese characters.

Why? Does echo "" > file mess with the encoding?

igbgotiz

1 Answer


Echo is a PowerShell alias for Write-Output.

I'm not sure what redirecting the output of Write-Output to a file does to the encoding.

I can reproduce your results: I get Chinese characters, but the encoding of the file is reported as ASCII (I used this function to test the file encoding: http://poshcode.org/2059).

Just running echo "" > ".\file.txt" puts 6 bytes in the file (255, 254, 13, 0, 10, 0), whereas running Set-Content .\file.txt "" puts 2 bytes (13, 10).

But as I understand it, the question is: why does this command result in this output?
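
A minimal sketch of where the Chinese characters come from, assuming the six bytes listed above and assuming the viewer honors the BOM and reads the whole file as UTF-16 LE. The scratch path is hypothetical, and the bytes are written directly instead of through `>` so the result does not depend on the PowerShell version's default redirection encoding. `AppendAllText` with two arguments writes UTF-8 without a BOM, so each ASCII character becomes one byte, and a UTF-16 reader pairs those bytes up into single characters.

```powershell
# Hypothetical scratch file in the temp directory
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'encoding-demo.txt'

# FF FE = UTF-16 LE byte order mark; 0D 00 0A 00 = CR LF as two-byte characters
[System.IO.File]::WriteAllBytes($path, [byte[]](0xFF, 0xFE, 0x0D, 0x00, 0x0A, 0x00))

# AppendAllText(path, text) uses UTF-8 without a BOM: one byte per ASCII character
[System.IO.File]::AppendAllText($path, 'teststring')

$bytes = [System.IO.File]::ReadAllBytes($path)
$bytes.Length    # 6 original bytes + 10 appended bytes = 16

# Decode everything after the BOM as UTF-16 LE, as a BOM-aware viewer would
$decoded = [System.Text.Encoding]::Unicode.GetString($bytes, 2, $bytes.Length - 2)
$codePoints = $decoded.ToCharArray() | ForEach-Object { 'U+{0:X4}' -f [int]$_ }
$codePoints -join ' '
# U+000D U+000A U+6574 U+7473 U+7473 U+6972 U+676E
```

The last five code points fall in the CJK Unified Ideographs range (U+4E00 to U+9FFF): the bytes of "te" (74 65) are read as the single character U+6574, and so on for each pair.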

Arluin
  • 13 and 10 in ASCII are carriage return and linefeed, together making a Windows line ending, so I guess Set-Content leaves a standard line ending in ASCII encoding, and echo is outputting 3 characters in a 2-bytes-per-character encoding: the BOM ff fe for Unicode (maybe), then a two-byte carriage return and a two-byte linefeed. How are you getting it to open and see Chinese characters? – TessellatingHeckler Mar 24 '14 at 19:46
  • I'm using baretail.exe to examine the output file, which shows the Chinese characters. – Arluin Mar 24 '14 at 20:19
  • @TessellatingHeckler Indeed, the byte sequence 0xFF 0xFE (`ÿþ`) is the Byte Order Mark for a Unicode encoding (little endian encoded UTF-16 to be precise). – Ansgar Wiechers Mar 24 '14 at 22:07
  • @Ansgar Wiechers Thanks. That explains "what" we are getting. Why does redirecting Write-Output to a file cause that output? – Arluin Mar 24 '14 at 22:27
  • @Ansgar Wiechers actually the link in your comment to the question addresses the why. Thanks. – Arluin Mar 24 '14 at 22:31
  • @Arluin Because PowerShell strings are UTF-16 encoded (instances of the [`System.String`](http://msdn.microsoft.com/en-us/library/system.string%28v=vs.110%29.aspx) class), and the output redirection simply doesn't tamper with that encoding unless you override it. – Ansgar Wiechers Mar 24 '14 at 22:32
  • @AnsgarWiechers interesting, thanks. The poshcode.org script that Arluin linked in their answer is looking for that pattern with the bytes the other way around (0xfeff) for 'unicode', and that script tells me the file produced by `echo "">test` is ASCII (wrongly). Is the script just wrong, or is there something about endianness it's looking for that I'm not aware of, do you know? – TessellatingHeckler Mar 24 '14 at 22:53
  • Actually, never mind, it's in the discussion comments on the C# code linked from the poshcode; 0xfeff for UTF-16 big endian, 0xfffe for UTF-16 little endian; the script errs by failing to check for one of them. – TessellatingHeckler Mar 24 '14 at 23:00
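
The BOM check discussed in the last two comments could be sketched as follows. This is an illustration only, not the poshcode.org script: `Get-Utf16Bom` is a hypothetical helper name, and it looks at the first two bytes only, so it cannot tell UTF-16 LE apart from UTF-32 LE (whose BOM also starts with FF FE).

```powershell
# Hypothetical helper: report which UTF-16 byte order mark, if any, starts a byte array
function Get-Utf16Bom {
    param([byte[]]$Bytes)

    if ($Bytes.Length -ge 2) {
        # FF FE: little endian, the pattern the linked script fails to check
        if ($Bytes[0] -eq 0xFF -and $Bytes[1] -eq 0xFE) { return 'UTF-16 LE' }
        # FE FF: big endian
        if ($Bytes[0] -eq 0xFE -and $Bytes[1] -eq 0xFF) { return 'UTF-16 BE' }
    }
    return 'no UTF-16 BOM'
}

# The six bytes reported for echo "" > file
$fromEcho = Get-Utf16Bom ([byte[]](0xFF, 0xFE, 0x0D, 0x00, 0x0A, 0x00))
# The two bytes reported for Set-Content
$fromSetContent = Get-Utf16Bom ([byte[]](0x0D, 0x0A))

$fromEcho          # UTF-16 LE
$fromSetContent    # no UTF-16 BOM
```

To check a real file, the bytes could be read with `[System.IO.File]::ReadAllBytes` and passed to the helper.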