LotPings' answer sensibly recommends using a regular expression with capture groups to extract the substrings of interest from each matching line.
You can incorporate that into your Select-String
call for a single-pipeline solution (the assumption is that the XML comments of interest are all on a single line each):
# Define the regex to use with Select-String, which both
# matches the lines of interest and captures the substrings of interest
# ('id' an 'instance' attributes) via capture groups, (...)
$regex = '<!--<__AMAZONSITE id="(.+?)" instance ="(.+?)"__/>-->'
Select-String -LiteralPath $xmlfile -Pattern $regex | ForEach-Object {
# Output a custom object with properties reflecting
# the substrings of interest reported by the capture groups.
[pscustomobject] @{
id = $_.Matches.Groups[1].Value
instance = $_.Matches.Groups[2].Value
}
}
The result is an array of custom objects that each have an .id
and .instance
property with the values of interest (which is preferable to setting individual variables); in the console, the output would look something like this:
id instance
-- --------
-123456780 CATZ00124
-123456781 CATZ00125
-123456782 CATZ00126
As for what you tried:
Note: I'm discussing your use of .Split()
, though for extracting a substring, as is your intent, .Split()
is not the best tool, given that it is only the first step toward isolating the substring of interest.
As LotPings notes in a comment, in Windows PowerShell, $commentedline.Line.Split('id=')
causes the String.Split()
method to split the input string by any of the individual characters in split string 'id='
, because the method overload that Windows PowerShell selects takes a char[]
value, i.e. an array of characters, which is not your intent.
You could rectify this as follows, by forcing use of the overload that accepts string[]
(even though you're only passing one string), which also requires passing an options argument:
$commentedline.Line.Split([string[] 'id=', 'None') # OK, splits by whole string
Note that in PowerShell Core the logic is reversed, because .NET Core introduced a new overload with just [string]
(with an optional options argument), which PowerShell Core selects by default. Conversely, this means that if you do want by-any-character splitting in PowerShell Core, you must cast the split string to [char[]]
.
On a general note, PowerShell has the -split
operator, which is regex-based and offers much more flexibility than String.Split()
- see this answer.
Applied to your case:
$commentedline.Line -split 'id='
While id=
is interpreted a regex by -split
, that makes no difference here, given that the string contains no regex metacharacters (characters with special meaning); if you do want to safely split by a literal substring, use [regex]::Escape('...')
as the RHS.
Note that -split
is case-insensitive by default, as PowerShell generally is; however, you can use the -csplit
variant for case-sensitive matching.