I've been given the task of searching for SSNs (and other PII so we can remove it) in our entire file structure, fun I know. So far this script will search thru all .xlsx files in a given directory, but no matter what I try, I cannot for the life of me get the $SearchText
variable to work. I have tried so many different deviations of the regex currently displayed, the only regex string that works is straight question marks; "???????????"
, but that returns entires I'm not looking for.
Any help would be very much appreciated.
Thanks!
$SourceLocation = "C:\Users\nick\Documents\ScriptingTest"
$SearchText2 = "^(?!(000|666|9))\d{3}-(?!00)\d{2}-(?!0000)\d{4}$"
$SearchText = "*"
$FileNames = Get-ChildItem -Path $SourceLocation -Recurse -Include *.xlsx
Function Search-Excel {
$Excel = New-Object -ComObject Excel.Application
$Workbook = $Excel.Workbooks.Open($File)
ForEach ($Worksheet in @($Workbook.Sheets)) {
$Found = $WorkSheet.Cells.Find($SearchText)
If ($Found.Text -match "SearchText2") {
$BeginAddress = $Found.Address(0,0,1,1)
[pscustomobject]@{
WorkSheet = $Worksheet.Name
Column = $Found.Column
Row =$Found.Row
Text = $Found.Text
Address = $File
}
Do {
$Found = $WorkSheet.Cells.FindNext($Found)
$Address = $Found.Address(0,0,1,1)
If ($Address -eq $BeginAddress) {
BREAK
}
[pscustomobject]@{
WorkSheet = $Worksheet.Name
Column = $Found.Column
Row =$Found.Row
Text = $Found.Text
Address = $File
}
} Until ($False)
}
}
}
$workbook.close($false)
[void][System.Runtime.InteropServices.Marshal]::ReleaseComObject([System.__ComObject]$excel)
[gc]::Collect()
[gc]::WaitForPendingFinalizers()
Remove-Variable excel -ErrorAction SilentlyContinue
foreach ($File in $FileNames)
{
Search-Excel
}
EDIT: Turns out excel has a very limited range of acceptable regex: Acceptable Excel Regex, so I modified the first $Searchtext viarable to just be "*", and the first if statement to match regex outside of excel's search. Now I just need to come up with a crafty regex pattern to filter what I want. The next problem is filtering:
- No letters.
- Valid SSNs with dashes.
- Valid SSNs without dashes. (this part is stumping me, how to search for something that can have dashes, but if it doesn't, it can only be 9 characters long)