I am trying to return the highest 4 digit number found in string pattern, in a set of documents.
String Pattern: 3 Letters dash 4 Digits
The word documents contain within them a document identifier code such as below.
Sample Files:
Car Parts.docx > CPW - 2345
CarHandles.docx > CPW - 8723
CarList.docx > CPA - 9083
I have referenced sample code that I am trying to adapt. I am not a VBA or powershell programmer - so I may be wrong in what I am trying to do?
I am happy to look at alternatives - on a Windows platform.
I have referenced this to get me started
http://chris-nullpayload.rhcloud.com/2012/07/find-and-replace-string-in-all-docx-files-recursively/
PowerShell: return the number of instances find in a file for a search pattern
Powershell: return filename with highest number
$list = gci "C:\Users\WP\Desktop\SearchFiles" -Include *.docx -Force -recurse
foreach ($foo in $list) {
$objWord = New-Object -ComObject word.application
$objWord.Visible = $False
$objDoc = $objWord.Documents.Open("$foo")
$objSelection = $objWord.Selection
$Pat1 = [regex]'[A-Z]{3}-[0-9]{4}' # Find the regex match 3 letters followed by 4 numbers eg HGW - 1024
$findtext= "$Pat1"
$highestNumber =
# Find the highest occurrence of this pattern found in the documents searched - output to text file or on screen
Sort-Object | # This may also be wrong -I added it for when I find the pattern
Select-Object -Last 1 -ExpandProperty Name
<# The below may not be needed - ?
$ReplaceText = ""
$ReplaceAll = 2
$FindContinue = 1
$MatchFuzzy = $False
$MatchCase = $False
$MatchPhrase = $false
$MatchWholeWord = $True
$MatchWildcards = $True
$MatchSoundsLike = $False
$MatchAllWordForms = $False
$Forward = $True
$Wrap = $FindContinue
$Format = $False
$objSelection.Find.execute(
$FindText,
$MatchCase,
$MatchWholeWord,
$MatchWildcards,
$MatchSoundsLike,
$MatchAllWordForms,
$Forward,
$Wrap,
$Format,
$ReplaceText,
$ReplaceAll
}
}
#>
I appreciate any advice on how to proceed -