
I made a script to compare a downloaded copy of a file with the remote version via SHA-1 checksums to see whether they match (to verify the download, check for changes, etc.).

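<?php
// $remote and $local are paths: one an http(s) URL, one a local file
// sha1_file(..., true) returns the raw 20-byte digest, or false on failure
$local_sha1 = sha1_file($local, true);
$remote_sha1 = sha1_file($remote, true);
// Strict comparison avoids PHP's loose-comparison type juggling
if ($local_sha1 === $remote_sha1) {
  echo "Match\n";
} else {
  echo "Mismatch\n";
}
// This says Mismatch every time.
?>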
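One thing the comparison above cannot distinguish is a failed read: `sha1_file()` returns `false` on failure, and two `false` values compare as equal. A minimal sketch of a more defensive check follows, assuming a hypothetical helper name `sha1_files_match` (not part of the original script) and using hex digests so the values are easy to log:

```php
<?php
// Hypothetical helper (not from the original post): compare two files by
// SHA-1 and report a failed read as "unknown" instead of as a match.
function sha1_files_match(string $a, string $b): ?bool
{
    // Without the second argument, sha1_file() returns a 40-character hex digest.
    $hash_a = sha1_file($a);
    $hash_b = sha1_file($b);
    if ($hash_a === false || $hash_b === false) {
        return null; // at least one file could not be read
    }
    // hash_equals() compares the two digests as plain strings.
    return hash_equals($hash_a, $hash_b);
}
```

Called as `sha1_files_match($local, $remote)`, it returns `true`, `false`, or `null`, so a download error is no longer reported as a match or a mismatch.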

I downloaded the file again (via browser) and overwrote the local copy. Still mismatch.

For further testing:

<?php
$local_string = @file_get_contents($local);
$remote_string = @file_get_contents($remote);
var_dump(strlen($local_string) == strlen($remote_string)); // always true
var_dump($local_string == $remote_string);                 // always false

// $x = offset, $l = length (test values)
var_dump(substr($local_string, $x, $l) == substr($remote_string, $x, $l));
// always true for any values of $x & $l, including negative values for $x
?>

I don't get it. Do you see something I'm missing? What other factor could affect the results?
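When two strings have the same length but compare as different, locating the first differing byte is usually more informative than comparing substrings by hand. The sketch below is one way to do that; `first_difference` is a hypothetical helper name, and the sample strings are made up purely for illustration:

```php
<?php
// Hypothetical helper: return the offset of the first differing byte,
// or null when the two strings are byte-for-byte identical.
function first_difference(string $a, string $b): ?int
{
    if ($a === $b) {
        return null;
    }
    $limit = min(strlen($a), strlen($b));
    for ($i = 0; $i < $limit; $i++) {
        if ($a[$i] !== $b[$i]) {
            return $i;
        }
    }
    // One string is a prefix of the other; they diverge where the shorter ends.
    return $limit;
}

// Illustrative input only: the strings differ at offset 3 ('d' vs 'x').
$offset = first_difference("abcdef", "abcxef");
echo $offset === null ? "Identical\n" : "First difference at byte $offset\n";
// prints "First difference at byte 3"
```

Run against `$local_string` and `$remote_string`, the returned offset can be passed to `substr()` and `bin2hex()` to inspect the bytes around the mismatch.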

Psudo
  • If you remove the `@` symbols (which hide errors) from the file_get_contents version (and [enable error reporting](https://stackoverflow.com/questions/1053424/how-do-i-get-php-errors-to-display?rq=1) if you haven’t already done so), do you see any errors? – rickdenhaan Feb 11 '22 at 11:37
  • Your second snippet answers the question you ask in the title. Different strings will produce different SHA1 hashes—that's the point. – Álvaro González Feb 11 '22 at 11:41
  • @rickdenhaan - I added the following lines at the top to make sure error reporting was on, and removed the `@` signs: `ini_set('display_errors', 1); ini_set('display_startup_errors', 1); error_reporting(E_ALL);` No errors. It also comes up with a different hash for the remote file every time I run it. The file: https://jamanetwork.com/journals/jam/articlepdf/2778361/jama_woolf_2021_ld_210023_1620138060.73422.pdf @Álvaro González - The thing is, the string comparison says they mismatch, but running them through substr in a way that returns the entire string unaltered says they do match. – Psudo Feb 11 '22 at 20:26
  • @Álvaro González - You are correct. My code comparing the string in parts didn't work right. The file downloads with a slight difference each time (I think a timestamp for when it was downloaded), which screws up the checksum. Sorry I doubted, and good work! – Psudo Feb 11 '22 at 21:38
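A quick way to confirm that kind of per-download variation is to hash the same source more than once and see whether the digests agree. This is a minimal sketch under that assumption; `source_is_stable` is a hypothetical helper name, and it works for any path or URL that `sha1_file()` can read:

```php
<?php
// Hypothetical helper: hash the same source several times and report
// whether every read produced the same SHA-1 digest.
function source_is_stable(string $source, int $fetches = 2): ?bool
{
    $first = null;
    for ($i = 0; $i < $fetches; $i++) {
        $hash = sha1_file($source);
        if ($hash === false) {
            return null; // a read failed, so stability is unknown
        }
        if ($first === null) {
            $first = $hash;      // remember the digest of the first read
        } elseif ($hash !== $first) {
            return false;        // a later read produced different bytes
        }
    }
    return true;
}
```

If `source_is_stable($remote)` returns `false`, the server is sending different bytes on each request, so a checksum of any saved copy cannot be expected to match a fresh download.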

0 Answers