Environment is Windows Server 2016. I have a folder with 290k subfolders, and those subfolders need to be moved. I also have a CSV file with two fields: one is the (numeric) folder name, the other is the destination. The problem: the CSV doesn't contain just the 290k rows I need, but about 5.3 million.
Structure of the CSV is basically:
"ORDER","CLIENT"
11223344,1111
22334455,1111
33445566,2222
44556677,1390
To clarify: every folder name is an ORDER. The order folders are to be sorted into CLIENT folders. Orders are unique, but not every order has a folder.
My first approach was to load the CSV and foreach through it. It was utterly slow, since I had to check for every entry whether it was actually an existing folder.
$rootfolder = 'path-to-folders'
$csvpath = 'path-to-csv'
$csv = Import-Csv -Path $csvpath
foreach ($row in $csv)
{
    # one filesystem check per CSV row -- 5.3 million Test-Path calls
    $path = Join-Path $rootfolder $row.ORDER
    $target = $row.CLIENT
    if (Test-Path -Path $path) {
        # copy files to $rootfolder\$target
    }
}
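One thing I considered: since the slow part seems to be calling Test-Path 5.3 million times, I could probably build a set of the existing folder names once and only act on CSV rows that are in it. A rough, untested sketch (same placeholder paths and ORDER/CLIENT columns as above):
$rootfolder = 'path-to-folders'
$csvpath = 'path-to-csv'
# collect the ~290k existing folder names once
$existing = [System.Collections.Generic.HashSet[string]]::new()
Get-ChildItem -Path $rootfolder -Directory | ForEach-Object { [void]$existing.Add($_.Name) }
foreach ($row in (Import-Csv -Path $csvpath)) {
    # cheap in-memory membership test instead of a Test-Path call per row
    if ($existing.Contains($row.ORDER)) {
        # copy files to $rootfolder\$($row.CLIENT)
    }
}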
My second approach was to get a list of the subfolders first, walk through that, and look up the corresponding value in the CSV with Where-Object. That turned out to be even slower.
$rootfolder = 'path-to-folders'
$csvpath = 'path-to-csv'
$folders = Get-ChildItem -Path $rootfolder
$csv = Import-Csv -Path $csvpath
foreach ($folder in $folders) {
    $currentpath = Join-Path $rootfolder $folder.Name
    if (Test-Path -Path $currentpath -PathType Container) {
        # linear Where-Object scan over 5.3 million rows for every folder
        $target = $csv | Where-Object ORDER -eq $folder.Name
        # copy files to $rootfolder\$($target.CLIENT)
    }
}
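I also wondered whether the lookup should be inverted: read the CSV once into a hashtable keyed by ORDER, so each of the 290k folders needs a single hashtable lookup instead of a Where-Object pass over 5.3 million rows. Again just a sketch with the same placeholders:
$rootfolder = 'path-to-folders'
$csvpath = 'path-to-csv'
# index the CSV once: ORDER -> CLIENT
$lookup = @{}
foreach ($row in (Import-Csv -Path $csvpath)) {
    $lookup[$row.ORDER] = $row.CLIENT
}
foreach ($folder in (Get-ChildItem -Path $rootfolder -Directory)) {
    # constant-time lookup per folder
    $client = $lookup[$folder.Name]
    if ($client) {
        # copy files to $rootfolder\$client
    }
}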
My next idea would be to load the CSV into a temporary database and do the lookups from there, since that should (?) be way faster.
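If I understand it right, even an in-memory System.Data.DataTable with a primary key on ORDER would behave like that (indexed lookups via Rows.Find instead of a full scan), without setting up a real database, though I haven't tested whether filling it with 5.3 million rows is practical:
$csvpath = 'path-to-csv'
$table = New-Object System.Data.DataTable
[void]$table.Columns.Add('ORDER', [string])
[void]$table.Columns.Add('CLIENT', [string])
$table.PrimaryKey = @($table.Columns['ORDER'])   # index on ORDER for Rows.Find()
foreach ($row in (Import-Csv -Path $csvpath)) {
    [void]$table.Rows.Add($row.ORDER, $row.CLIENT)
}
# indexed lookup; returns $null when the ORDER isn't in the CSV
$match = $table.Rows.Find('11223344')
if ($match) { $match.CLIENT }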
But maybe there are more sophisticated ways than my caveman approach? Thanks for reading.