There are helpful regexes in the existing answers; using one with the -replace
operator allows you to extract the information of interest in a single operation:
$line = '<outputColumn id="426" name="Net Salary per month € (3rd Applicant)" description="" lineageId="426" precision="0" scale="0" length="255" dataType="wstr" codePage="0" sortKeyPosition="0" comparisonFlags="0" specialFlags="0" errorOrTruncationOperation="Conversion" errorRowDisposition="FailComponent" truncationRowDisposition="FailComponent" externalMetadataColumnId="425" mappedColumnId="0"/>'
# Extract the "name" attribute value.
# Note how the regex is designed to match the *full line*, which is then
# replaced with what the first (and only) capture group, (...), matched, $1
$line -replace '^.+ name="([^"]*).+', '$1'
This outputs a string with the verbatim content Net Salary per month € (3rd Applicant).
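One caveat with the -replace approach: if the regex does not match a given line, -replace returns that line unchanged, so non-matching lines pass through as-is. If that matters, a minimal sketch using the -match operator and the automatic $Matches variable might look like the following; the shortened sample lines and the $lines / $names variable names are illustrative assumptions, not part of the original input:

```powershell
# Hypothetical sample input: one matching line and one without a "name" attribute.
$lines = @(
  '<outputColumn id="426" name="Net Salary per month € (3rd Applicant)" description=""/>'
  '<!-- a line without the attribute of interest -->'
)

# -match returns $true only for lines the regex matches, and populates
# the automatic $Matches variable; index 1 is the first capture group.
# Non-matching lines are skipped rather than passed through.
$names = foreach ($line in $lines) {
  if ($line -match ' name="([^"]*)"') { $Matches[1] }
}

$names
```

With this input, only the attribute value from the first line is output; the second line produces nothing.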
Taking a step back: Your sample line is a valid XML element, and it's always preferable to use a dedicated XML parser.
Parsing each line individually as XML will be slow, but perhaps you can parse the entire file, which offers a simple solution using PowerShell's property-based adaptation of the XML DOM, via the [xml] type (System.Xml.XmlDocument):
$fileContent = @'
<xml>
<outputColumn id="426" name="Net Salary per month € (3rd Applicant)" description="" lineageId="426" precision="0" scale="0" length="255" dataType="wstr" codePage="0" sortKeyPosition="0" comparisonFlags="0" specialFlags="0" errorOrTruncationOperation="Conversion" errorRowDisposition="FailComponent" truncationRowDisposition="FailComponent" externalMetadataColumnId="425" mappedColumnId="0"/>
<outputColumn id="427" name="Net Salary per month € (4th Applicant)" description="" lineageId="426" precision="0" scale="0" length="255" dataType="wstr" codePage="0" sortKeyPosition="0" comparisonFlags="0" specialFlags="0" errorOrTruncationOperation="Conversion" errorRowDisposition="FailComponent" truncationRowDisposition="FailComponent" externalMetadataColumnId="425" mappedColumnId="0"/>
</xml>
'@
([xml] $fileContent).xml.outputColumn.name
The above yields the "name" attribute values across all <outputColumn> elements:
Net Salary per month € (3rd Applicant)
Net Salary per month € (4th Applicant)
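If you only need some of the elements, an XPath query via the Select-Xml cmdlet is another option. The following is a sketch under assumptions: the XML text is shortened, and the id value used in the filter is picked purely for illustration.

```powershell
# Shortened, illustrative version of the sample document.
$xmlText = @'
<xml>
<outputColumn id="426" name="Net Salary per month € (3rd Applicant)" dataType="wstr"/>
<outputColumn id="427" name="Net Salary per month € (4th Applicant)" dataType="wstr"/>
</xml>
'@

# Select only the "name" attribute of the element whose "id" attribute is 427.
# Select-Xml emits match-info objects; .Node is the matched XmlAttribute,
# and .Value is that attribute's string value.
$names = Select-Xml -Content $xmlText -XPath '//outputColumn[@id="427"]/@name' |
  ForEach-Object { $_.Node.Value }

$names
```

This outputs only the value for the 4th applicant. To query a file on disk instead of a string, Select-Xml also accepts a -Path argument in place of -Content.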