
I'm quite new to PowerShell, so please excuse my inexperience or lack of best practices. I am writing a script that first stores the MD5 hashes of two different directories in CSV format. I then want to compare the two CSV files. If there are any differences (i.e. the MD5 hash of a file doesn't match, or a file listed in the source folder's CSV is absent from the destination folder's CSV), I want to generate a new list with the details of the mismatched or absent files.

In addition to this, I was wondering if there is a faster way of comparing the two files because the folder that I am dealing with is quite large and has about 12k files in it. I was thinking that maybe I'd try to keep the source CSV intact and remove the entries from the destination CSV one by one as they are checked and verified. Can anyone please help me with this?

#Getting the MD5 hash of the Installer and storing it in CSV format
$SourcePath = Get-ChildItem -Path C:\source -Recurse -File
$SourceHash = foreach ($File in $SourcePath) 
{
    Get-FileHash $File.FullName -Algorithm MD5
}
$SourceHash | Export-Csv -Path C:\Users\abcd\Desktop\CSVExports\SourceHash.csv

#Getting the MD5 hash of the destination directory and storing it in CSV format
$DestinationPath = Get-ChildItem -Path C:\destination -Recurse -File
$DestinationHash = foreach ($File in $DestinationPath) 
{
    Get-FileHash $File.FullName -Algorithm MD5    
}
$DestinationHash | Export-Csv -Path C:\Users\abcd\Desktop\CSVExports\DestinationHash.csv

#Comparing the hashes of Installer and Destination directories
Compare-Object -ReferenceObject (Import-Csv C:\Users\abcd\Desktop\CSVExports\InstallerHash.csv) -DifferenceObject (Import-Csv C:\Users\abcd\Desktop\CSVExports\DestinationHash.csv) -Property Hash | Export-Csv C:\Users\abcd\Desktop\CSVExports\ResultTable
Aithorusa
  • "*...is a faster way of comparing the two files...*", see: [Powershell Speed: How to speed up ForEach-Object MD5/hash check](https://stackoverflow.com/a/59916692/1701026) – iRon Dec 28 '20 at 09:28
  • You are using the wrong variable name after the first foreach: InstallerHash instead of SourceHash. – Smorkster Dec 28 '20 at 10:27

2 Answers

param(
   $firstDirectoryName = "D:\tmp\001",
   $SecondDirectoryName = "D:\tmp\002"
)
$firstList = Get-ChildItem $firstDirectoryName -File -Recurse | ForEach-Object {
   [PSCustomObject]@{
      relativePath =  $_.FullName.TrimStart($firstDirectoryName)
      hash = (Get-FileHash $_.FullName -Algorithm MD5).Hash
   }
}
$secondList = Get-ChildItem $SecondDirectoryName -File -Recurse | ForEach-Object {
   [PSCustomObject]@{
      relativePath =  $_.FullName.TrimStart($SecondDirectoryName)
      hash = (Get-FileHash $_.FullName -Algorithm MD5).Hash
   }
}

Compare-Object -ReferenceObject  $firstList -DifferenceObject $secondList -Property relativePath, hash
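
Compare-Object reports a file whose hash differs once per side, so the result still has to be read row by row. If the goal is a single list that labels each source file as mismatched or absent, one possible approach is to index the second list in a hashtable and look each source entry up. This is only a sketch: the sample entries and the `status` property are made up for illustration and stand in for the `$firstList` and `$secondList` built above.

```powershell
# Hypothetical sample data in the same shape as $firstList / $secondList.
$firstList = @(
    [PSCustomObject]@{ relativePath = '\a.txt'; hash = 'AAA' }
    [PSCustomObject]@{ relativePath = '\b.txt'; hash = 'BBB' }
    [PSCustomObject]@{ relativePath = '\c.txt'; hash = 'CCC' }
)
$secondList = @(
    [PSCustomObject]@{ relativePath = '\a.txt'; hash = 'AAA' }
    [PSCustomObject]@{ relativePath = '\b.txt'; hash = 'XXX' }
)

# Index the second list by relative path so each lookup is a hashtable hit
# instead of a scan over the whole list.
$secondByPath = @{}
foreach ($item in $secondList) {
    $secondByPath[$item.relativePath] = $item.hash
}

# Walk the first list once and emit only the problem entries.
$report = foreach ($item in $firstList) {
    if (-not $secondByPath.ContainsKey($item.relativePath)) {
        [PSCustomObject]@{ relativePath = $item.relativePath; status = 'Absent' }
    }
    elseif ($secondByPath[$item.relativePath] -ne $item.hash) {
        [PSCustomObject]@{ relativePath = $item.relativePath; status = 'Mismatch' }
    }
}

$report
```

With the sample data, `\b.txt` is reported as `Mismatch` and `\c.txt` as `Absent`. The report can be piped to Export-Csv if a file is needed.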


  • Suggestion: replace FullName.TrimStart($firstDirectoryName) with FullName.Substring($firstDirectoryName.Length). TrimStart treats $firstDirectoryName as a set of characters and strips every leading character of FullName that belongs to that set, so it can eat into the relative path. What we want is to remove only the common part of the path, hence the Substring proposal. – Piotr Tyburski May 10 '23 at 09:51
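
The difference described in the comment above can be seen with a small example. The paths here are hypothetical and chosen so that the first subfolder name consists only of characters that also occur in the root path.

```powershell
# Hypothetical paths, picked only to show the difference.
$root = 'D:\tmp\001'
$full = 'D:\tmp\001\100\report.txt'

# TrimStart takes a set of characters (here D : \ t m p 0 1, written
# explicitly as a char array) and strips leading characters while they
# belong to that set. The folder name "100" is stripped as well.
$trimmed = $full.TrimStart($root.ToCharArray())

# Substring cuts exactly as many characters as the root is long.
$relative = $full.Substring($root.Length)

$trimmed    # report.txt
$relative   # \100\report.txt
```

With TrimStart, two different files such as `\100\report.txt` and `\report.txt` would end up with the same relative path, which would make the comparison unreliable.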

Can you explain the use case of this exercise a bit more? It looks like another tool, such as a proper file sync tool, would be much more efficient at this than trying to do it in PowerShell.

Anyway, since you asked:

  • How often do you need to check the hashes?
  • Maybe just check the files that are updated recently, instead of checking all 12k files?
  • PowerShell 7 supports ForEach-Object -Parallel, which can speed up the process.
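
The last point could look roughly like the sketch below. It assumes PowerShell 7 or later. The function name `Get-RelativeHash` and the default throttle limit of 4 are made up for illustration, not taken from the question. Hashing is often limited by disk speed rather than CPU, so any gain should be measured rather than assumed.

```powershell
# Requires PowerShell 7+ (ForEach-Object -Parallel is not available in 5.1).
# Get-RelativeHash is a hypothetical helper name used only for this sketch.
function Get-RelativeHash {
    param(
        [Parameter(Mandatory)] [string] $Root,
        [int] $ThrottleLimit = 4   # assumed value; tune for the machine
    )

    # Resolve to a full path so relative paths are computed consistently.
    $rootPath = (Resolve-Path -LiteralPath $Root).ProviderPath

    Get-ChildItem -LiteralPath $rootPath -File -Recurse |
        ForEach-Object -Parallel {
            # $using: brings the outer variable into the parallel runspace.
            [PSCustomObject]@{
                RelativePath = [System.IO.Path]::GetRelativePath($using:rootPath, $_.FullName)
                Hash         = (Get-FileHash -LiteralPath $_.FullName -Algorithm MD5).Hash
            }
        } -ThrottleLimit $ThrottleLimit
}
```

It could then be called once per directory, for example `Get-RelativeHash -Root C:\source` and `Get-RelativeHash -Root C:\destination`, and the two results compared on `RelativePath` and `Hash`. The order of the output is not guaranteed when running in parallel.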
michiel Thai
  • It's an installer and I need to compare the hashes of the source with the destination directory. 1. I only need to check the hashes once after the installer runs 2. Since it's going to be a fresh installation, I will have to run it for all 12k files the first time. – Aithorusa Dec 28 '20 at 11:06
  • Can't you zip the two folders and compare the hashes of the zip files instead? – michiel Thai Dec 28 '20 at 11:28