I've got several million lines of xml to parse. For one application I am looking to extract 3 pieces of data for use in other scripts.
The xml looks something like the following (several dozen tags have been removed per grouping) I can change one of the name tags if it helps; though not desirable it will require some intermediate processing. Not all node groups have the extended attributes.
<?xml version="1.0" encoding="IBM437"?>
<topo>
<node>
<name>device1Name</name>
<extendedAttributes>
<attribute>
<name>tagCategoryName</name>
<value>tagValue</value>
</attribute>
</extendedAttributes>
</node>
<node>
<name>device2Name</name>
<extendedAttributes>
<attribute>
<name>tagCategoryName</name>
<value>tagValue</value>
</attribute>
</extendedAttributes>
</node>
<node>
<name>device3Name</name>
</node>
...
...
</topo>
The output I am looking for each node is
deviceName tagCategoryName tagValue
I've attempted several approaches and have been unable to find an elegant solution. Started with
$xml = [xml](get-content prodnodes.txt)
Tried some Select-Xml with xpath, with direct $xml.topo.node addressing piping to select object using property names. I was unable to target the names effectively with the following.
$xml.topo.node | select-object -property name, extendedAttributes.attribute.name, extendedAttributes.attribute.value
It would return only the name The following worked to get me an additional attribute but I couldn't extend it without issues.
$munge = $xml.topo.node | select-object -property name, {$_.extendedAttributes.attribute.name}
Attempting to extend it looked like this
$munge = $xml.topo.node | select-object -property name, {$_.extendedAttributes.attribute.name, $_.extendedAttributes.attribute.value}
which gave output like this
deviceName1 {tagCategoryName1, tagValue1}
deviceName2 {tagCategoryName1, tagValue2}
deviceName3 {$null, $null}
deviceName4 {tagCategoryName2, tagValue3}
...
...
Is there a way to clean this up, or another approach that is more effective?