
These two files are clearly different, but Compare-Object finds no difference. The first file starts with a UTF-8 byte order mark (BOM). Is there any way to get Compare-Object to identify that they are different?

PS C:\Temp> dir file*

    Directory: C:\Temp

Mode                 LastWriteTime         Length Name
----                 -------------         ------ ----
-a---          2021-06-01    14:38         163673 file1.txt
-a---          2021-03-18    17:08         163670 file2.txt

PS C:\Temp> Compare-Object -ReferenceObject (Get-Content -Path .\file1.txt) -DifferenceObject (Get-Content -Path .\file2.txt)
PS C:\Temp> Format-Hex -Path .\file1.txt | Select-Object -First 1

   Label: C:\Temp\file1.txt

          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
0000000000000000 ***EF BB BF*** 3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E <?xml version

PS C:\Temp> Format-Hex -Path .\file2.txt | Select-Object -First 1

   Label: C:\Temp\file2.txt

          Offset Bytes                                           Ascii
                 00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
          ------ ----------------------------------------------- -----
0000000000000000 3C 3F 78 6D 6C 20 76 65 72 73 69 6F 6E 3D 22 31 <?xml version="1

PS C:\Temp> $PSVersionTable.PSVersion.ToString()
7.1.3
lit
  • I get the same results in PS5.1. You could do a comparison between the bytes as a workaround but I guess that's not what you're looking for. – Santiago Squarzon Jun 01 '21 at 20:40

3 Answers


The two files are different, but the strings produced by Get-Content are identical.

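In Windows PowerShell 5.1 and earlier, use -Encoding Byte to make Get-Content read the raw byte values (PowerShell 6+ removed this value in favor of the -AsByteStream switch):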

Compare-Object (Get-Content .\file2.txt -Encoding Byte) (Get-Content .\file1.txt -Encoding Byte)
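
If the same script has to run on both Windows PowerShell 5.1 and PowerShell 6+, one option is to pick the byte-reading syntax by version. The sketch below assumes a hypothetical helper named Get-FileBytes (it is not a built-in cmdlet) and relies only on $PSVersionTable and the two Get-Content forms mentioned in this thread.

```powershell
# Hypothetical helper (not part of PowerShell): returns a file's raw bytes
# on both Windows PowerShell 5.1 and PowerShell 6+.
function Get-FileBytes {
    param(
        [Parameter(Mandatory)][string]$Path
    )
    if ($PSVersionTable.PSVersion.Major -ge 6) {
        # PowerShell 6+ replaced -Encoding Byte with the -AsByteStream switch.
        Get-Content -LiteralPath $Path -AsByteStream
    }
    else {
        # Windows PowerShell 5.1 and earlier.
        Get-Content -LiteralPath $Path -Encoding Byte
    }
}
```

With that in place, the comparison could be written as `Compare-Object (Get-FileBytes .\file1.txt) (Get-FileBytes .\file2.txt)`.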

You might find Get-FileHash more efficient for detecting byte-level equality of two (or more) files:

PS ~> Get-FileHash *.txt
    
Algorithm       Hash                                                                   Path
---------       ----                                                                   ----
SHA256          F1945CD6C19E56B3C1C78943EF5EC18116907A4CA1EFC40A57D48AB1DB7ADFC5       C:\path\to\file1.txt
SHA256          E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855       C:\path\to\file2.txt
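
To get a yes/no answer instead of reading the table by eye, the two Hash values can be compared directly. This is a minimal sketch; Test-SameFileHash is a hypothetical helper name, and SHA256 is chosen only because it is the algorithm shown in the output above.

```powershell
# Hypothetical helper: returns $true when both files produce the same SHA256 hash.
function Test-SameFileHash {
    param(
        [Parameter(Mandatory)][string]$ReferencePath,
        [Parameter(Mandatory)][string]$DifferencePath
    )
    $referenceHash  = Get-FileHash -LiteralPath $ReferencePath  -Algorithm SHA256
    $differenceHash = Get-FileHash -LiteralPath $DifferencePath -Algorithm SHA256

    # Hash is a hex string, so a plain string comparison is enough.
    $referenceHash.Hash -eq $differenceHash.Hash
}
```

For the files in the question, this would be called as `Test-SameFileHash .\file1.txt .\file2.txt`. It only says whether the files differ, not where.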
Mathias R. Jessen
  • `Byte` is not a valid encoding. – lit Jun 02 '21 at 01:20
  • `Get-Content: Cannot process argument transformation on parameter 'Encoding'. 'Byte' is not a supported encoding name.` – lit Jun 02 '21 at 01:26
  • @lit: Unfortunately, `-Encoding Byte` was removed from PowerShell (Core) v6+ in favor of the new `-AsByteStream` switch - a frivolous breaking change discussed in [GitHub issue #7986](https://github.com/PowerShell/PowerShell/issues/7986) – mklement0 Jun 20 '21 at 22:21

In PowerShell 6 and later, Get-Content -AsByteStream reads the raw bytes, so Compare-Object reports the difference.

PS C:\Temp> Compare-Object -ReferenceObject (Get-Content -Path .\file1.txt -AsByteStream) `
    -DifferenceObject (Get-Content -Path .\file2.txt -AsByteStream)

InputObject SideIndicator
----------- -------------
        239 <=
        187 <=
        191 <=

PS C:\Temp> $null -eq (Compare-Object -ReferenceObject (Get-Content -Path .\file1.txt -AsByteStream) `
    -DifferenceObject (Get-Content -Path .\file1.txt -AsByteStream))
True
PS C:\Temp> $null -eq (Compare-Object -ReferenceObject (Get-Content -Path .\file1.txt -AsByteStream) `
    -DifferenceObject (Get-Content -Path .\file2.txt -AsByteStream))
False
lit

Quick answer: Yes, Compare-Object can do a binary compare.

BUT: It's hopelessly slow and will crash on large files.

A buffered approach written by Kees Bakker is a better choice. If you prefer a native command, FC (run it as fc.exe, because fc alone is a PowerShell alias for Format-Custom) is slower, but not terribly slow for a small job.

See a comparison of all those methods in "Speed of binary file comparisons in PowerShell".
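
To illustrate what a buffered comparison can look like, here is a rough sketch. It is not Kees Bakker's code; Test-FileBytesEqual is a hypothetical name and the 64 KB default chunk size is an arbitrary assumption. It reads both files in chunks through .NET streams and stops at the first chunk that differs, so neither file has to fit in memory.

```powershell
# Hypothetical helper, not Kees Bakker's implementation:
# compares two files chunk by chunk and returns $true when every byte matches.
function Test-FileBytesEqual {
    param(
        [Parameter(Mandatory)][string]$ReferencePath,
        [Parameter(Mandatory)][string]$DifferencePath,
        [int]$BufferSize = 65536   # assumed chunk size, tune as needed
    )
    # .NET does not follow PowerShell's current location, so resolve full paths first.
    $refFull  = (Resolve-Path -LiteralPath $ReferencePath).ProviderPath
    $diffFull = (Resolve-Path -LiteralPath $DifferencePath).ProviderPath

    $refStream = $null
    $diffStream = $null
    try {
        $refStream  = [System.IO.File]::OpenRead($refFull)
        $diffStream = [System.IO.File]::OpenRead($diffFull)

        # Files of different length can never be byte-identical.
        if ($refStream.Length -ne $diffStream.Length) { return $false }

        $refReader  = New-Object System.IO.BinaryReader $refStream
        $diffReader = New-Object System.IO.BinaryReader $diffStream
        while ($true) {
            # ReadBytes returns fewer bytes only at end of file.
            [byte[]]$refChunk  = $refReader.ReadBytes($BufferSize)
            [byte[]]$diffChunk = $diffReader.ReadBytes($BufferSize)

            if ($refChunk.Length -eq 0) { return $true }   # both streams exhausted
            if (-not [System.Linq.Enumerable]::SequenceEqual($refChunk, $diffChunk)) {
                return $false
            }
        }
    }
    finally {
        if ($refStream)  { $refStream.Dispose() }
        if ($diffStream) { $diffStream.Dispose() }
    }
}
```

For the files in the question, this would be called as `Test-FileBytesEqual .\file1.txt .\file2.txt`. Because the lengths differ (163673 versus 163670 bytes), the length check alone would settle it without reading any content.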

NewSites