I'm trying to read data from an XML file into PowerShell.

<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="tmp.xslt"?>
<Book>
    <Subbook id="1848">
        <variable Name="Metadata_Code" Value="A-"/>
        <variable Name="Metadata_Installation" Value="pod - A"/>
        <Topic...>
        <Topic....>
    </Subbook>
    <Subbook id="1849">
        <variable Name="Metadata_Code" Value="B-"/>
        <variable Name="Metadata_Installation" Value="pod - B"/>
        <Topic...>
        <Topic....>
    </Subbook>
</Book>

I want to loop through all Subbook elements. Each Subbook element contains a few variable elements. In addition it contains other elements I have to process. I can set up the loop.

[xml]$xmlfile = Get-Content -Path $xmlpath
$subbooks = Select-Xml -Path $xmlpath -XPath '//Subbook'
$subbooks | ForEach-Object {
    $MDC = $_.Node.SelectNodes('variable[@Name="Metadata_Code"]/@Value')
}

But then I'm stuck. I need to select the value of the variable node that has the name 'Metadata_Code'. In XSLT, this would be variable[@Name='Metadata_Code']/@Value.

$MDC remains empty. How can I get the value I need into this variable?

Hobbes
  • In your example, `$xmlfile` is an `XmlDocument`, and you can browse its contents like this: `$xmlfile.Book.Subbook`, which gives you all the `Subbook` nodes. You can feed these to `ForEach-Object` for further processing, e.g. `$xmlfile.Book.Subbook | ForEach-Object { ... }` – boxdog Jun 08 '23 at 14:28
  • Yes, but that's the part that's already working. – Hobbes Jun 08 '23 at 14:31
  • If you want `A-` and `B-`, you can filter with `Where-Object`; it's really simple if you use PowerShell for the filtering. – Santiago Squarzon Jun 08 '23 at 14:32

3 Answers

Assuming what you're looking for in your example XML is A- and B-, if you want to use PowerShell to perform the filtering, the code would look like this:

($xmlfile = [xml]::new()).Load($xmlpath)
$xmlfile.Book.Subbook.variable |
    Where-Object Name -EQ 'Metadata_Code' |
    ForEach-Object Value
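
If each Subbook element has to be processed as a whole, the same dot notation can be applied inside a loop over the Subbook elements. The following is a minimal sketch rather than part of the code above; the inline sample XML, the `$results` variable, and the `Id` and `Code` property names are assumptions chosen for illustration.

```powershell
# Sample XML mirroring the question's structure (Topic elements omitted).
[xml]$xmlfile = @'
<Book>
  <Subbook id="1848">
    <variable Name="Metadata_Code" Value="A-"/>
    <variable Name="Metadata_Installation" Value="pod - A"/>
  </Subbook>
  <Subbook id="1849">
    <variable Name="Metadata_Code" Value="B-"/>
    <variable Name="Metadata_Installation" Value="pod - B"/>
  </Subbook>
</Book>
'@

$results = foreach ($subbook in $xmlfile.Book.Subbook) {
    # Filter this Subbook's variable elements by their Name attribute
    # and read the Value attribute of the match.
    $code = ($subbook.variable | Where-Object Name -EQ 'Metadata_Code').Value

    # $subbook is still the full element here, so its other children
    # can be processed in the same iteration.
    [pscustomobject]@{ Id = $subbook.id; Code = $code }
}

$results
```

This keeps one object per Subbook, so the code value stays associated with the element it came from.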
Santiago Squarzon
  • My post wasn't clear enough. I have to process the `<Subbook>` elements, which contain a bunch of other stuff in addition to the `<variable>` elements. This means I can't just loop over the variables. – Hobbes Jun 08 '23 at 14:49
  • I don't understand. If you want to manipulate the objects having `Name=Metadata_Code`, then remove the `ForEach-Object Value` part of the code and use dot notation on these objects instead. @Hobbes – Santiago Squarzon Jun 08 '23 at 15:02

Santiago's helpful answer shows you how to solve your problem using PowerShell's dot-notation-based adaptation of the [xml] DOM.

As for your XPath-based attempt:

Update, based on your own answer:

tl;dr: Your code works, but a display bug prevented you from realizing that. Details below.

  • $MDC wasn't actually empty, but the way you tried to visualize the System.Xml.XmlNodeList instance it contained (returned by .SelectNodes()), namely via Write-Host, resulted in no display output.

    • While this should be considered a bug - see GitHub issue #19769 - it is rarely helpful to use Write-Host to display complex objects.

    • In fact, Write-Host is typically the wrong tool to use, unless the intent is to write to the display only, bypassing the success output stream and with it the ability to send output to other commands, capture it in a variable, or redirect it to a file. To output a value, use it by itself, e.g., $MDC instead of Write-Host $MDC (or use Write-Output $MDC); see this answer. To explicitly print only to the display but with rich formatting, use Out-Host.

  • Additionally, as also shown in the code below, you needed to access the .Value property of the System.Xml.XmlAttribute instance in $MDC in order to get only the attribute's value (text).

  • As an aside: if you're looking for just a single node identified by an attribute, use .SelectSingleNode() instead of .SelectNodes().
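
To illustrate the points above inside a per-Subbook loop, here is a minimal sketch, assuming each Subbook has at most one Metadata_Code variable. The inline sample XML and the `$doc`, `$attr`, and `$codes` variable names are made up for this illustration.

```powershell
# Sample XML mirroring the question's structure (Topic elements omitted).
[xml]$doc = @'
<Book>
  <Subbook id="1848">
    <variable Name="Metadata_Code" Value="A-"/>
    <variable Name="Metadata_Installation" Value="pod - A"/>
  </Subbook>
  <Subbook id="1849">
    <variable Name="Metadata_Code" Value="B-"/>
    <variable Name="Metadata_Installation" Value="pod - B"/>
  </Subbook>
</Book>
'@

$codes = foreach ($subbook in $doc.SelectNodes('//Subbook')) {
    # SelectSingleNode() returns the first matching node, or $null if none matches.
    $attr = $subbook.SelectSingleNode('variable[@Name="Metadata_Code"]/@Value')

    if ($null -ne $attr) {
        # .Value yields the attribute's text rather than the XmlAttribute object.
        $MDC = $attr.Value

        # Emit the value by itself (implicit output) instead of using Write-Host,
        # so that it can be captured, piped, or redirected.
        $MDC
    }
}

$codes
```

With the sample data, `$codes` ends up holding the strings A- and B-.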

Aside from that, your code can be simplified. Here's a self-contained example with your sample XML:

# Create sample XML file
@'
<?xml version="1.0" encoding="UTF-8"?>

<?xml-stylesheet type="text/xsl" href="tmp.xslt"?>
<Book>
<Subbook id="1848">
<variable Name="Metadata_Code" Value="A-"/>
<variable Name="Metadata_Installation" Value="pod - A"/>
</Subbook>
<Subbook id="1849">
<variable Name="Metadata_Code" Value="B-"/>
<variable Name="Metadata_Installation" Value="pod - B"/>
</Subbook>
</Book>
'@ > file.xml

# Let Select-Xml parse the file directly
# and use a single XPath query to extract the target nodes.
Get-Item file.xml | 
  Select-Xml '//Subbook/variable[@Name="Metadata_Code"]/@Value' | 
  ForEach-Object { $_.Node.Value }

Output (prepend something like $MDC = to the pipeline above to capture the output strings in a variable):

A-
B-
mklement0

The problem is that

$MDC = $_.Node.SelectNodes('variable[@Name="Metadata_Code"]/@Value')

outputs a node-set. I was using Write-Host $MDC to inspect the contents, but that displays nothing when the variable holds a node-set.

This gives me the value I was looking for:

$MDC = $_.Node.SelectNodes('variable[@Name="Metadata_Code"]/@Value').Value
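
The reason .Value works directly on the node list is PowerShell's member-access enumeration: the list itself has no Value property, so the property is read from each node it contains. A small sketch of that behavior follows, using an inline one-Subbook sample and the hypothetical variable names `$doc`, `$single`, and `$several`.

```powershell
[xml]$doc = @'
<Subbook id="1848">
  <variable Name="Metadata_Code" Value="A-"/>
  <variable Name="Metadata_Installation" Value="pod - A"/>
</Subbook>
'@

# One matching attribute node: .Value on the list yields a single string.
$single = $doc.DocumentElement.SelectNodes('variable[@Name="Metadata_Code"]/@Value').Value

# Two matching attribute nodes: .Value on the list yields an array of strings.
$several = $doc.DocumentElement.SelectNodes('variable/@Value').Value

$single
$several.Count
```

So the result is a plain string only when the XPath expression matches exactly one node; with several matches it becomes an array.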
Hobbes
  • The tl;dr is: your code worked, but a display bug in `Write-Host` (which you're not using in the code in your question) prevented you from realizing that. I've updated my answer with what I hope is a clearer framing of the issue, along with a link to the relevant bug report and advice to generally avoid `Write-Host`. – mklement0 Jun 08 '23 at 16:07