I am using PowerShell 2.0 on a Windows 7 desktop and am trying to search the enterprise CIFS shares for keywords/regex patterns. I already have a simple single-threaded script that does this, but a single keyword takes 19-22 hours. I have written a multithreaded script (my first attempt at multithreading) based on the article by Surly Admin,
Can Powershell Run Commands in Parallel?
Powershell Throttle Multi thread jobs via job completion
and the links related to those posts.
I decided to use runspaces rather than background jobs because the prevailing wisdom is that runspaces are more efficient. The problem is that the multithreaded script produces only partial output. I am not sure whether this is an I/O issue, a memory issue, or something else. Hopefully someone here can help. Here is the code.
cls
Get-Date
Remove-Item C:\Users\user\Desktop\results.txt
$Throttle = 5 #threads
$ScriptBlock = {
    Param (
        $File
    )
    $KeywordInfo = Select-String -pattern KEYWORD -AllMatches -InputObject $File
    $KeywordOut = New-Object PSObject -Property @{
        Matches = $KeywordInfo.Matches
        Path = $KeywordInfo.Path
    }
    Return $KeywordOut
}
$RunspacePool = [RunspaceFactory]::CreateRunspacePool(1, $Throttle)
$RunspacePool.Open()
$Jobs = @()
$Files = Get-ChildItem -recurse -erroraction silentlycontinue
ForEach ($File in $Files) {
    $Job = [powershell]::Create().AddScript($ScriptBlock).AddArgument($File)
    $Job.RunspacePool = $RunspacePool
    $Jobs += New-Object PSObject -Property @{
        File = $File
        Pipe = $Job
        Result = $Job.BeginInvoke()
    }
}
Write-Host "Waiting.." -NoNewline
Do {
    Write-Host "." -NoNewline
    Start-Sleep -Seconds 1
} While ( $Jobs.Result.IsCompleted -contains $false)
Write-Host "All jobs completed!"
$Results = @()
ForEach ($Job in $Jobs) {
    $Results += $Job.Pipe.EndInvoke($Job.Result)
    $Job.Pipe.EndInvoke($Job.Result) | Where {$_.Path} | Format-List | Out-File -FilePath C:\Users\user\Desktop\results.txt -Append -Encoding UTF8 -Width 512
}
Invoke-Item C:\Users\user\Desktop\results.txt
Get-Date
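One possible cause worth ruling out, offered as a guess rather than a confirmed diagnosis: reading a property off an array (member enumeration) was only added in PowerShell 3.0. On 2.0, an expression such as $KeywordInfo.Path is empty whenever Select-String returns more than one matching line, so those files would be dropped by the Where {$_.Path} filter, and $Jobs.Result.IsCompleted would likewise be empty, so the wait loop would exit early. The sketch below is a minimal variant that avoids both expressions and calls EndInvoke once per job. The sample file name, the pool size of 2, and the $Pending and $Found variables are assumptions made for illustration only.

```powershell
# Hypothetical sample file with two matching lines (stands in for one file on a share).
$Sample = Join-Path ([System.IO.Path]::GetTempPath()) 'runspace-sample.txt'
Set-Content -Path $Sample -Value 'KEYWORD one', 'nothing here', 'KEYWORD two'

$ScriptBlock = {
    Param (
        $File
    )
    # Emit one object per matching line, so no property is read off an array.
    Select-String -Pattern KEYWORD -AllMatches -InputObject $File |
        Select-Object Matches, Path
}

$RunspacePool = [RunspaceFactory]::CreateRunspacePool(1, 2)
$RunspacePool.Open()

$Pipe = [powershell]::Create().AddScript($ScriptBlock).AddArgument((Get-Item $Sample))
$Pipe.RunspacePool = $RunspacePool
$Jobs = @(New-Object PSObject -Property @{
    Pipe   = $Pipe
    Result = $Pipe.BeginInvoke()
})

# Test each job's handle explicitly instead of $Jobs.Result.IsCompleted.
Do {
    Start-Sleep -Milliseconds 200
    $Pending = @($Jobs | Where-Object { -not $_.Result.IsCompleted })
} While ($Pending.Count -gt 0)

# Collect the output with a single EndInvoke call per job, then clean up.
$Found = @()
ForEach ($Job in $Jobs) {
    $Found += $Job.Pipe.EndInvoke($Job.Result)
    $Job.Pipe.Dispose()
}
$RunspacePool.Close()
Remove-Item $Sample

$Found | Format-List
```

If this variant lists both matching lines for the sample file while the original script block lists none for it, that would point at the array property access rather than at I/O or memory.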
This is the single-threaded version that works, including the regex I am using for Social Security numbers.
cls
Get-Date
Remove-Item C:\Users\user\Desktop\results.txt
$files = Get-ChildItem -recurse -erroraction silentlycontinue
ForEach ($file in $files) {
    Select-String -pattern '[sS][sS][nN]:*\s*\d{3}-*\d{2}-*\d{4}' -AllMatches -InputObject $file | Select-Object matches, path |
        Format-List | Out-File -FilePath C:\Users\user\Desktop\results.txt -Append -Encoding UTF8 -Width 512
}
Get-Date
Invoke-Item C:\Users\user\Desktop\results.txt