I have two folders containing files with the same names. How can I use checksums in bash to find out which files are identical and which differ? So far I've written the script below, which builds the corresponding file names and runs "cksum" on each of them. That prints two numbers per file (the checksum and the byte count), but I would somehow have to save these numbers for each pair and subtract them to see which pairs don't match.

#!/bin/bash
folderOld="home/OldFiles/"
folderNew="home/NewFiles/"
for ((fileNumber=1;fileNumber<1000000;fileNumber++))
do
    FileName="file${fileNumber}.dat"
    OldFile=$folderOld$FileName
    NewFile=$folderNew$FileName
    # Quote the paths so names with spaces or glob characters stay intact
    cksum "$OldFile"
    cksum "$NewFile"
done
darkblue80
    You might be better off using `diff -q home/OldFiles home/NewFiles` to find the files that are different. See [Given two directory trees, how can I find out which files differ by content?](https://stackoverflow.com/q/4997693/4154375). Comparing checksums is slower and more complicated. Also, `cksum` checksums are *far* too short to be useful in comparing large numbers of files. – pjh Mar 16 '22 at 17:30

3 Answers

1

There is no need to compute checksums if all you need to know is whether the files differ. Use diff:

#!/usr/bin/env bash

folderOld="home/OldFiles/"
folderNew="home/NewFiles/"
for ((fileNumber = 1; fileNumber < 1000000; fileNumber++)); do
  FileName="file${fileNumber}.dat"
  if diff -q "$folderOld$FileName" "$folderNew$FileName" >/dev/null; then
    printf 'File %s is same in %s and %s\n' "$FileName" "$folderOld" "$folderNew"
  else
    printf 'File %s differs in %s and %s\n' "$FileName" "$folderOld" "$folderNew"
  fi
done
Léa Gris
1
for i in {1..999999}; do
    cmp {old/dir,new/dir}"/file$i.dat" || echo "file$i.dat: no match"
done
dan
0

You don't need to subtract the numbers; you just need to check whether they are equal:

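OldSum=$(cksum "$OldFile" | cut -d' ' -f1)
NewSum=$(cksum "$NewFile" | cut -d' ' -f1)
if [[ $OldSum != "$NewSum" ]]; then
    # Checksum mismatch, do something useful here
    # (an if body cannot be empty in bash, so print a message as a placeholder)
    echo "Checksum mismatch: $OldFile vs $NewFile"
fi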

I'm using `cut -d' ' -f1` here to split the line on spaces and take the first field, so the file size and file name are ignored.
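
The same field split can also be done with bash's `read` builtin, which names each part of the `cksum` output explicitly. This is only a minimal sketch, assuming bash (it uses process substitution); `crc_of` is a hypothetical helper name chosen for illustration:

```shell
#!/usr/bin/env bash

# Print only the CRC of a file, dropping the byte count and file name.
# crc_of is a hypothetical helper name used for illustration.
crc_of() {
  local crc size name
  # cksum prints "CRC SIZE NAME"; read assigns one field per variable.
  # If cksum fails (e.g. missing file), read sees no input and we return 1.
  read -r crc size name < <(cksum "$1") || return 1
  printf '%s\n' "$crc"
}
```

With that helper, the comparison could be written as `[[ $(crc_of "$OldFile") != "$(crc_of "$NewFile")" ]]`.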

By the way, cksum uses CRC32 by default, which has a fair risk of false negatives on so many files: two different files can happen to share the same 32-bit checksum, so the mismatch goes undetected. Better to use, for example, sha256sum.
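
A sketch of the same comparison with `sha256sum`, assuming GNU coreutils is installed (on macOS the equivalent is typically `shasum -a 256`). The helper names `sha256_of` and `same_sha256` are hypothetical, and the exit-status convention (0 same, 1 different, 2 unreadable file) is just one possible choice:

```shell
#!/usr/bin/env bash

# Print only the SHA-256 digest of a file (hypothetical helper name).
# Reading the file on stdin makes sha256sum print "DIGEST  -".
sha256_of() {
  local out
  out=$(sha256sum < "$1") || return 1
  # Strip everything from the first space onward, leaving the digest.
  printf '%s\n' "${out%% *}"
}

# Exit status: 0 if the digests match, 1 if they differ,
# 2 if either file could not be read.
same_sha256() {
  local old_sum new_sum
  old_sum=$(sha256_of "$1") || return 2
  new_sum=$(sha256_of "$2") || return 2
  [[ $old_sum == "$new_sum" ]]
}
```

Inside the loop this could be used as `same_sha256 "$OldFile" "$NewFile" || echo "$FileName: no match"`; note that an unreadable file is reported as a non-match here too.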

Thomas