
I'm attempting to update the hosts file on a Windows server using a heredoc in PowerShell.

I can't figure out why my result has extra spaces between every character in each hosts entry.

I'm porting some scripting from Linux.

PS C:\Users\Administrator> cat C:\Users\Administrator\AppData\Local\Temp\etchosts.ps1
@"
127.0.0.1 src.example.com
127.0.0.2 builds.example.com
127.0.0.3 ti.example.com
127.0.0.4 jira.example.com
"@ >>C:\Windows\System32\drivers\etc\hosts



PS C:\Users\Administrator> powershell C:\Users\Administrator\AppData\Local\Temp\etchosts.ps1
PS C:\Users\Administrator> cat C:\Windows\System32\drivers\etc\hosts
# Copyright (c) 1993-2009 Microsoft Corp.
#
# This is a sample HOSTS file used by Microsoft TCP/IP for Windows.
#
# This file contains the mappings of IP addresses to host names. Each
# entry should be kept on an individual line. The IP address should
# be placed in the first column followed by the corresponding host name.
# The IP address and the host name should be separated by at least one
# space.
#
# Additionally, comments (such as these) may be inserted on individual
# lines or following the machine name denoted by a '#' symbol.
#
# For example:
#
#      102.54.94.97     rhino.acme.com          # source server
#       38.25.63.10     x.acme.com              # x client host

# localhost name resolution is handled within DNS itself.
#       127.0.0.1       localhost
#       ::1             localhost
 1 2 7 . 0 . 0 . 1   s r c . e x a m p l e . c o m

 1 2 7 . 0 . 0 . 2   b u i l d s . e x a m p l e . c o m

 1 2 7 . 0 . 0 . 3   t i . e x a m p l e . c o m

 1 2 7 . 0 . 0 . 4   j i r a . e x a m p l e . c o m

I expected no spaces between all the characters. If there is a "Windows" way to do this, I would appreciate any input/suggestions.

phuclv
mikedoy

2 Answers


A here-string is just a special form of a PowerShell string literal, and like all strings in PowerShell and .NET (System.String), its in-memory encoding is always UTF-16.

  • As an aside: For a string literal to be read into memory correctly, the enclosing script file must be properly encoded; the best choice is UTF-8 with a BOM - see this answer.

However, what matters is how you write an (in-memory) string to a file.

>> file is effectively the same as | Out-File -Append file, and in Windows PowerShell Out-File defaults to UTF-16LE encoding ("Unicode"), where each character is (typically) encoded with 2 bytes. What appear to be spaces are actually the NUL (0x0) bytes in the 2nd byte of the encoding of each ASCII-range character.
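To see where those NUL bytes come from, here is a minimal sketch that encodes one of the addresses the same way and prints the resulting bytes as hex. The variable names `$bytes` and `$hex` are just for illustration.

```powershell
# Encode an ASCII-range string as UTF-16LE, which .NET calls "Unicode".
$bytes = [System.Text.Encoding]::Unicode.GetBytes('127.0.0.1')

# Format each byte as two hex digits; every second byte is 00 (NUL).
$hex = ($bytes | ForEach-Object { $_.ToString('X2') }) -join ' '
$hex
# → 31 00 32 00 37 00 2E 00 30 00 2E 00 30 00 2E 00 31 00
```

A viewer that treats the file as single-byte text typically renders each 00 as a blank, which would explain the "spaced out" look of the appended entries.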

  • As an aside: In PowerShell Core, BOM-less UTF-8 is the - much more sensible - default; since UTF-8 is backward-compatible with characters in the ASCII range, your code would have worked fine in PowerShell Core.

By contrast, C:\Windows\System32\drivers\etc\hosts is ASCII-encoded (1 byte per character).

To match that encoding, use Add-Content instead of >>:

@"
127.0.0.1 src.example.com
127.0.0.2 builds.example.com
127.0.0.3 ti.example.com
127.0.0.4 jira.example.com
"@ | Add-Content C:\Windows\System32\drivers\etc\hosts

Unlike Out-File -Append, Add-Content matches the encoding of a file's preexisting content. If there is none, it defaults to the active ANSI code page's encoding ("Default") in Windows PowerShell, like Set-Content. In the absence of a BOM, as in this case, ANSI encoding is assumed, but with ASCII-range-only input characters that is effectively the same as ASCII, given that ANSI code pages are supersets of ASCII.
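If you would rather not rely on encoding detection, one option is to state the encoding explicitly. The following is a sketch only: it writes to a scratch file in the temp directory (the name `hosts-demo.txt` is hypothetical) rather than the real hosts file, then counts NUL bytes to confirm that no UTF-16 text was appended.

```powershell
# Hypothetical scratch file standing in for the real hosts file.
$target = Join-Path ([System.IO.Path]::GetTempPath()) 'hosts-demo.txt'

# Seed it with single-byte (ASCII) content, like the stock hosts file.
Set-Content -Path $target -Value '# sample hosts file' -Encoding ascii

# Append with the encoding stated explicitly instead of inferred.
@"
127.0.0.1 src.example.com
127.0.0.2 builds.example.com
"@ | Add-Content -Path $target -Encoding ascii

# Count NUL bytes in the result; 0 means only single-byte text was written.
$nulCount = @([System.IO.File]::ReadAllBytes($target) | Where-Object { $_ -eq 0 }).Count
$nulCount
```

The same `-Encoding ascii` argument can also be passed to `Out-File -Append` if you prefer to stay close to the original `>>` form.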



mklement0
  • I feel like we need to take your best, most thorough post about encoding and sticky it somewhere. This comes up A LOT. – AdminOfThings Aug 16 '19 at 23:20
  • Thanks, @AdminOfThings. Ideally, the official documentation should provide this information, and there's an [open suggestion to add that](https://github.com/MicrosoftDocs/PowerShell-Docs/issues/3190). In the meantime, [this answer](https://stackoverflow.com/a/40098904/45375) probably provides the most comprehensive overview. – mklement0 Aug 16 '19 at 23:33
  • Thanks. This really helps. I was about to go down the path of installing WSL, but since this is a script to configure an AWS instance, that install requires a reboot, which complicates things. I do have a related question. I have this code in the userdata for the instance: @' {{.sshconfig}} '@ >C:\ProgramData\ssh\sshd_config ... and this seems to have the correct encoding. Do you know why? (the {{.sshconfig}} gets replaced) – mikedoy Aug 17 '19 at 13:49
  • @mikedoy: Glad to hear it. `>C:\ProgramData\ssh\sshd_config` would only work as expected (with ASCII-range-only chars.) if you used PowerShell _Core_, or, in Windows PowerShell v5.1, if you changed `>`'s encoding globally to ASCII or ANSI ("Default"), as shown in [this answer](https://stackoverflow.com/a/40098904/45375). – mklement0 Aug 17 '19 at 14:27
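For reference, the global change mentioned in the last comment can be sketched as follows. This assumes Windows PowerShell 5.1 or later, where `>` and `>>` call Out-File and therefore pick up its default parameter values. The scratch file name `redirect-demo.txt` is hypothetical.

```powershell
# Make Out-File, and therefore > and >>, default to ASCII in the current session.
# Assumption: Windows PowerShell 5.1 or later, where redirection calls Out-File.
$PSDefaultParameterValues['Out-File:Encoding'] = 'ascii'

# Hypothetical scratch file, used here instead of a real config file.
$demo = Join-Path ([System.IO.Path]::GetTempPath()) 'redirect-demo.txt'
'127.0.0.1 src.example.com' > $demo

# Count NUL bytes; 0 means the redirection wrote single-byte text.
$nulBytes = @([System.IO.File]::ReadAllBytes($demo) | Where-Object { $_ -eq 0 }).Count
$nulBytes
```

The setting lasts only for the current session unless it is added to a profile script, so an unattended provisioning script would need to set it itself before using `>` or `>>`.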

I would never use Out-File -Append or >> for this. Neither checks what the file's current encoding is, which is a terrible idiosyncrasy of Windows PowerShell 5. Now you have a file that mixes ASCII and UTF-16 ("Unicode") text, and the apparent spaces are actually NUL bytes. I prefer Add-Content in this case, because it checks for a BOM first.

js2010