I have this script working, but with 100k+ rows in File1 and 200k+ in file 2, it will take days to complete. I got the where.({ part down to less than a second, with both csv files as data tables, but with that route I can't get the data out the way I want. This script outputs the data the way I want, but it takes 4 seconds per lookup. What can I do to speed this up?
I thought containskey somewhere might help, but on PRACT_ID there is a one to many relationship, so not sure how to handle those? Thx.
Invoke-Expression "C:\SHC\MSO\DataTable\functionlibrary.ps1"
[System.Data.DataTable]$Script:MappingTable = New-Object System.Data.DataTable
$File1 = Import-csv "C:\File1.csv" -Delimiter '|' | Sort-Object PRACT_ID
$File2 = Get-Content "C:\File2.csv" | Select-Object -Skip 1 | Sort-Object
$Script:MappingTable = $File1 | Out-DataTable
$Logs = "C:\Testing1.7.csv"
[System.Object]$UserOutput = @()
foreach ($name in $File1) {
[string]$userMatch = $File2.Where( { $_.Split("|")[0] -eq $name.PRACT_ID })
if ($userMatch) {
# Process the data
$UserOutput += New-Object PsObject -property @{
ID_NUMBER = $name.ID_NUMBER
PRACT_ID = $name.PRACT_ID
LAST_NAME = $name.LAST_NAME
FIRST_NAME = $name.FIRST_NAME
MIDDLE_INITIAL = $name.MIDDLE_INITIAL
DEGREE = $name.DEGREE
EMAILADDRESS = $name.EMAILADDRESS
PRIMARY_CLINIC_PHONE = $name.PRIMARY_CLINIC_PHONE
SPECIALTY_NAME = $name.SPECIALTY_NAME
State_License = $name.State_License
NPI_Number = $name.NPI_Number
'University Affiliation' = $name.'University Affiliation'
Teaching_Title = $name.Teaching_Title
FACILITY = $userMatch
}
}
}
$UserOutput | Select-Object ID_NUMBER, PRACT_ID, LAST_NAME, FIRST_NAME, MIDDLE_INITIAL, DEGREE, EMAILADDRESS, PRIMARY_CLINIC_PHONE, SPECIALTY_NAME, State_License, NPI_Number, 'University Affiliation', Teaching_Title, FACILITY |
Export-Csv $logs -NoTypeInformation