I need to compare hundreds of files on Windows, one pair at a time. The files can be binary or text (of any kind).

I'm looking for the fastest way to do this. I only need to know whether the contents of each pair are identical; I don't care about the content itself, I just have to report == or !=.

What could be the fastest way to do so? fc.exe? something else?

If fc.exe is the answer, are there any parameters that would speed it up?

I'd prefer to use an EXE that's part of a standard Windows installation (but it's not a must).

THANK YOU

Mark
  • May depend on the filelengths involved, but `FC` would seem to be the way to go. Now - how is the list of filenames presented? – Magoo Jan 23 '14 at 15:46
  • Files' length could be misleading since the content could be different but the file size is just the same. I run fc.exe file1 file2 on each set of files, separately. –  Jan 23 '14 at 16:46
  • As may be. Now - how are the filenames presented? – Magoo Jan 23 '14 at 16:53
  • You could use file length as an initial filter - two files that have different length can not have the same contents, so there's no need to check such a pair. – MicroVirus Jan 23 '14 at 17:39
  • @Magoo: what do you mean 'presented'? They could be text or binaries –  Jan 23 '14 at 18:33
  • @MicroVirus: What's the most recommended way to get files' length? I'd prefer to get sizes of two files in one execution (to save time) –  Jan 23 '14 at 18:35
  • @user1762109: I don't know the most recommended way, but assuming you want to work with batch-files you could check http://stackoverflow.com/questions/1199645/how-can-i-check-the-size-of-a-file-in-a-windows-batch-script – MicroVirus Jan 23 '14 at 22:56

3 Answers

I'm assuming you want to do a binary comparison.

I would use the following to compare two files:

fc "file1" "file2" /b >nul && echo "file1" == "file2" || echo "file1" != "file2"

EDIT

If you have many very large files to compare, it may be worthwhile to compare file sizes before using FC to compare the entire file, because two files of different sizes cannot have the same content. I use a single indicator variable (same) so that the actions for the "same" and "different" results are each defined just once, without resorting to CALLed subroutines. A CALL is relatively slow.

rem Clear the indicator; it is set only if the sizes match and FC finds no differences.
set "same="
for %%A in ("file1") do for %%B in ("file2") do (
  if %%~zA equ %%~zB fc %%A %%B /b >nul && set "same=1"
)
if defined same (
  echo "file1" == "file2"
) else (
  echo "file1" != "file2"
)
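
For readers who can use something outside a stock Windows install, the same size-first idea can be sketched in Python. This is only an illustration under assumptions: Python is not part of a standard Windows installation, and the function name `files_equal` and the 1 MiB chunk size are made up for this example, not taken from the batch code above.

```python
import os

CHUNK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice, tune as needed


def files_equal(path_a, path_b, chunk_size=CHUNK_SIZE):
    """Return True if the two files have identical contents."""
    # Different sizes mean different contents, so no data needs to be read.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(chunk_size)
            block_b = fb.read(chunk_size)
            if block_a != block_b:
                return False  # stop at the first differing block
            if not block_a:
                return True  # both files ended at the same point
```

Calling `files_equal("file1", "file2")` would then give the == or != answer for one pair. The standard library's `filecmp.cmp(path_a, path_b, shallow=False)` performs a similar byte-by-byte comparison and may be enough on its own.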
dbenham

You can get a CRC hex string of each file using a third-party command line tool and compare the hex strings.

Depending on how the pairs are formed, this method means each file needs to be read only once, even if it takes part in several comparisons.
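
A rough sketch of the read-each-file-once idea, written in Python purely for illustration. Assumptions to note: it uses SHA-256 from `hashlib` in place of a CRC tool, and the names `file_digest` and `compare_pairs` are hypothetical helpers invented for this example.

```python
import hashlib


def file_digest(path, chunk_size=1024 * 1024):
    """Hash one file in a single sequential pass and return the hex digest."""
    h = hashlib.sha256()  # assumption: SHA-256 swapped in for the CRC mentioned above
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def compare_pairs(pairs):
    """Return (path_a, path_b, is_same) for each pair, hashing each file once."""
    cache = {}  # path -> digest, so a file used in several pairs is read only once

    def digest(path):
        if path not in cache:
            cache[path] = file_digest(path)
        return cache[path]

    return [(a, b, digest(a) == digest(b)) for a, b in pairs]
```

Matching digests are very strong evidence of equal content rather than a byte-for-byte proof. If every file appears in only one pair, hashing still reads both files completely, so it probably gains nothing over a direct comparison that can stop at the first difference.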

foxidrive

@foxidrive's answer has merit, especially when the files are big and on the same physical drive: unless the comparison software reads large chunks at a time, comparing two such files directly causes disk thrashing.

Two utils available are

Built in (on Windows 10 at least): C:\Windows\System32\certutil.exe. It has a slightly awkward (enormous) parameter set but works well, does many other useful conversions, and supports a large number of hash algorithms (sorry, I lost the link to the full list); it definitely works with sha1, sha256, sha384 and md5. Ex. certutil -hashfile "E:\huge.7z" md5

and the Microsoft utility fciv (MD5 generator by default): search for "Microsoft File Checksum Integrity Verifier" and download it from Microsoft. Ex. fciv bigFile.7z

I know this is an old question, but I haven't seen these two utilities mentioned much... Hope it helps someone.

Phil Mundy