
I cannot see any end of line byte

echo "hello" | Format-Hex -Raw -Encoding Ascii

is there a way to show them?

Edit: I also have a file that shows the same behaviour, and this one contains multiple lines, as confirmed by both cat and notepad.

PS C:\dev\cur CMR-27473_AMI_not_stopping_in_ecat_fault 97984 > cat .\x.txt
helo
helo2
PS C:\dev\cur CMR-27473_AMI_not_stopping_in_ecat_fault 97984 > Get-Content .\x.txt | Format-Hex -Raw


           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000   68 65 6C 6F                                      helo


           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000   68 65 6C 6F 32                                   helo2

I do see the two records. But I want to see the end-of-line characters as well, that is, the raw byte content.

mklement0
Stefano Borini

2 Answers


If you mean newline, there isn't one in the source string. Thus, Format-Hex won't show one.

Windows uses the CR LF sequence (0x0D, 0x0A) for newlines. To see the control characters, append a newline to the string, like so:

"hello"+[environment]::newline | Format-Hex -Raw -Encoding Ascii


           00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F

00000000   68 65 6C 6C 6F 0D 0A                             hello..

One can also use PowerShell's backtick escape sequences, "hello`r`n", for the same effect as appending [Environment]::NewLine, though only the latter is platform-aware.
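To check which bytes each spelling produces without relying on how Format-Hex renders them, the strings can be encoded directly. This is only a sketch; the variable names are illustrative, and the second result depends on the platform the code runs on.

```powershell
# `r is CR (0x0D) and `n is LF (0x0A), regardless of platform.
$explicit = [System.Text.Encoding]::ASCII.GetBytes("hello`r`n")

# [Environment]::NewLine is CR LF on Windows but a lone LF on Unix-like platforms.
$platform = [System.Text.Encoding]::ASCII.GetBytes("hello" + [Environment]::NewLine)

# Render both byte arrays as space-separated hex pairs.
$explicitHex = ($explicit | ForEach-Object { $_.ToString('X2') }) -join ' '
$platformHex = ($platform | ForEach-Object { $_.ToString('X2') }) -join ' '

$explicitHex   # 68 65 6C 6C 6F 0D 0A
$platformHex   # ends in 0D 0A on Windows, in 0A alone elsewhere
```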

Addendum as per the comment and edit:

PowerShell's Get-Content is trying to be smart. In most use cases[citation needed], data read from text files does not need to include the newline characters. Get-Content populates an array, and each line read from the file ends up in its own element. What use would a newline be?

When output is redirected to a file, PowerShell is trying to be smart again. In most use cases[citation needed], adding text to a text file means adding new lines of data, not appending to an existing line. There is actually a separate switch for suppressing the trailing newline: Add-Content -NoNewline.
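Both behaviours can be observed with a scratch file. The following is a minimal sketch, assuming a writable temp directory; the file name newline-demo.txt is made up for the illustration.

```powershell
# Hypothetical scratch file; any writable path works.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'newline-demo.txt'

# Write two lines with explicit CR LF terminators.
[System.IO.File]::WriteAllText($path, "helo`r`nhelo2`r`n")

# Get-Content splits on the line terminators and drops them.
$lines = @(Get-Content -LiteralPath $path)
$lineCount   = $lines.Count        # 2
$firstLength = $lines[0].Length    # 4: 'helo', with no CR or LF attached

# Add-Content normally appends a newline after the value; -NoNewline suppresses it.
Add-Content -LiteralPath $path -Value 'x' -NoNewline
$bytes    = [System.IO.File]::ReadAllBytes($path)
$lastByte = $bytes[-1]             # 0x78 ('x'), so nothing was added after it

Remove-Item -LiteralPath $path
```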

What's more, high-level languages do not have a specific string termination character. In modern languages a string is an object, and the length of the string is stored as an attribute of that object.

In low-level languages, there is no concept of a string. It's just a bunch of characters stuffed together. How, then, would one know where a "string" begins and ends? Pascal's approach is to allocate a byte at the beginning that holds the length of the actual string data. C uses null-terminated strings. In DOS, assembly programs used dollar-terminated strings.
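For a .NET string, which is what PowerShell works with, this is easy to see: the length comes from the object, and encoding the string yields no terminator byte. The snippet below is an illustrative sketch; $cStyle merely imitates what a C-style consumer would expect and is not something PowerShell produces on its own.

```powershell
$s = 'hello'
$charCount = $s.Length                                # 5, stored on the string object

# Encoding the string produces exactly one byte per ASCII character.
$bytes = [System.Text.Encoding]::ASCII.GetBytes($s)
$byteCount = $bytes.Length                            # 5: no terminator is added

# A C-style, null-terminated layout needs the 0x00 appended explicitly.
$cStyle = $bytes + [byte]0
$cStyleCount = $cStyle.Length                         # 6
```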

vonPryz

To complement vonPryz's helpful answer:

tl;dr:

Format-Hex .\x.txt

is the way to inspect a file's raw byte content with Format-Hex; i.e., you need to pass the input file path as a direct argument (to the implied -Path parameter).

Once the pipeline is involved, any strings you're dealing with are by definition .NET string objects, which are inherently UTF-16-encoded.

echo "hello", which is really Write-Output "hello", given that echo is a built-in alias for Write-Output, writes a single string object to the pipeline, as-is - and given that it has no embedded newline, Format-Hex doesn't show one.
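To make the encoding point concrete, the same string can be turned into bytes under two encodings. This is a sketch using .NET's encoding classes directly rather than Format-Hex, so it behaves the same in Windows PowerShell and PowerShell Core; the variable names are illustrative.

```powershell
$s = 'hello'

# UTF-16LE, which mirrors the in-memory form of a .NET string: two bytes per character here.
$utf16 = [System.Text.Encoding]::Unicode.GetBytes($s)

# ASCII re-encoding: one byte per character.
$ascii = [System.Text.Encoding]::ASCII.GetBytes($s)

$utf16Hex = ($utf16 | ForEach-Object { $_.ToString('X2') }) -join ' '
$asciiHex = ($ascii | ForEach-Object { $_.ToString('X2') }) -join ' '

$utf16Hex   # 68 00 65 00 6C 00 6C 00 6F 00
$asciiHex   # 68 65 6C 6C 6F
```

Neither byte sequence contains 0D or 0A, because the string itself has no newline.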

For more, read on.


  • Generally, PowerShell has no concept of sending raw data through a pipeline: you're always dealing with instances of .NET types (objects).

  • Therefore, when Format-Hex receives pipeline input, it never sees raw byte streams; it operates on .NET strings, which are inherently UTF-16 ("Unicode") strings.

    • It is only then that the -Encoding parameter applies: it re-encodes the .NET strings on output.

    • By default, the output encoding is ASCII in Windows PowerShell, and UTF-8 in PowerShell Core.
      Note: In Windows PowerShell, this means that by default characters outside the 7-bit ASCII range are transcoded in a "lossy" fashion to the literal ? character (whose Unicode code point and byte value is 0x3F).

    • The -Raw switch only makes sense in combination with [int] (System.Int32)-typed input in Windows PowerShell v5.1 and is obsolete in PowerShell Core, where it has no effect whatsoever.[1]

  • echo is a built-in alias for the Write-Output cmdlet, and it accepts objects to write to the pipeline.

    • In your case, that object is a single-line string (an object of type [string] (System.String)), which, as stated, has no embedded newline sequence.
    • As an aside: PowerShell implicitly outputs anything that isn't captured (assigned to a variable or redirected elsewhere), so your command can be written more idiomatically as:

      "hello" | Format-Hex
      
  • Similarly, cat is a built-in alias for the Get-Content cmdlet, which reads a text file's content as an array of lines, i.e., into a string array whose elements do not end in a newline.

    • It is the array elements that are written to the pipeline, one by one, and Format-Hex renders the bytes of each separately - but, again, without any newlines, because the input objects (array elements representing lines without a trailing newline) do not contain any.

    • The only way to see newlines is to read the file as a whole, which is what the - somewhat confusingly named - -Raw switch does:

      Get-Content -Raw .\x.txt | Format-Hex
      

      While this now does reflect the actual newlines present in the file, note that it is not a raw byte representation of the file, for the reasons mentioned.
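The difference between the bytes on disk and a decoded, re-encoded string can be shown without Format-Hex at all. The following is a sketch under stated assumptions: the scratch file raw-demo.txt is hypothetical, its content is chosen to contain one non-ASCII character, and the .NET file and encoding APIs stand in for what the cmdlets do internally.

```powershell
# Hypothetical scratch file; any writable path works.
$path = Join-Path ([System.IO.Path]::GetTempPath()) 'raw-demo.txt'

# UTF-8 bytes for 'é' (C3 A9) followed by CR LF, written without a BOM.
[System.IO.File]::WriteAllBytes($path, [byte[]](0xC3, 0xA9, 0x0D, 0x0A))

# Exactly what is on disk.
$rawBytes = [System.IO.File]::ReadAllBytes($path)

# Decoded into a single .NET string, newline included.
$text = Get-Content -Raw -Encoding UTF8 -LiteralPath $path

# Re-encoding to ASCII is lossy: 'é' becomes '?' (0x3F); CR LF survives.
$reencoded = [System.Text.Encoding]::ASCII.GetBytes($text)

$rawHex       = ($rawBytes  | ForEach-Object { $_.ToString('X2') }) -join ' '
$reencodedHex = ($reencoded | ForEach-Object { $_.ToString('X2') }) -join ' '

$rawHex         # C3 A9 0D 0A
$reencodedHex   # 3F 0D 0A

Remove-Item -LiteralPath $path
```

The newline bytes appear in both results, but only the first one matches the file byte for byte.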


[1] -Raw's purpose up to v5.1 was never documented, but it is now described as obsolete and having no effect.
In short: [int]-typed input was not necessarily represented by the 4 bytes it comprises - single-byte or double-byte sequences were used, if the value was small enough, in favor of more compact output; -Raw would deactivate this and output the faithful 4-byte representation.
In PS Core [v6+], you now always and invariably get the faithful byte representation, and -Raw has no effect; for the full story see this GitHub pull request.

mklement0