tl;dr
As you suspected, a name collision prevented access to the .Item property on the XML elements of interest; fix the problem with explicit enumeration of the parent elements:
$xml.PatchScan.Machine.Product |
% { $_.Item | select BulletinId, PatchName, Status }
% is a built-in alias for the ForEach-Object cmdlet; see the bottom section for an explanation.
As an alternative, Ansgar Wiechers' helpful answer offers a concise XPath-based solution, which is both efficient and allows for sophisticated queries.
As an aside: PowerShell v3+ comes with the Select-Xml cmdlet, which takes a file path as an argument, allowing for a single-pipeline solution:
(Select-Xml -LiteralPath X:\folder\my.xml '//Product/Item[@Class="Patch"]').Node |
Select-Object BulletinId, PatchName, Status
Note:
Select-Xml wraps the matching XML nodes in an outer object, hence the need to access the .Node property.
As with direct use of the .NET APIs, querying XML documents that have namespaces requires extra work, namely declaring a hashtable that maps namespace prefixes to namespace URIs, and using those prefixes in the XPath query; see this answer.
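A minimal sketch of the namespace case, assuming a made-up namespace URI and a self-chosen prefix (ps); neither comes from the question's document. The hashtable is passed to Select-Xml via its -Namespace parameter, and -Content is used so that the sample needs no file:

```powershell
# Hypothetical document with a default namespace (the URI is made up for illustration).
$xmlText = @'
<PatchScan xmlns="http://example.org/patchscan">
  <Item Class="Patch"><BulletinId>MSAF-054</BulletinId></Item>
  <Item Class="Patch"><BulletinId>MSAF-055</BulletinId></Item>
</PatchScan>
'@

# Map a self-chosen prefix to the namespace URI ...
$ns = @{ ps = 'http://example.org/patchscan' }

# ... and use that prefix with every element name in the XPath query.
$ids = (Select-Xml -Content $xmlText -XPath '//ps:Item[@Class="Patch"]/ps:BulletinId' -Namespace $ns).Node.InnerText

$ids   # MSAF-054, MSAF-055
```

Note that the prefix in the query need not match anything in the document itself; only the namespace URI has to match.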
PowerShell's adaptation of the XML DOM (dot notation):
PowerShell decorates the object hierarchy contained in [System.Xml.XmlDocument] instances (created with the [xml] cast, for instance):
with properties named for the input document's specific elements and attributes[1] at every level; e.g.:
([xml] '<foo><bar>baz</bar></foo>').foo.bar # -> 'baz'
([xml] '<foo><bar id="1" /></foo>').foo.bar.id # -> '1'
turning multiple elements of the same name at a given hierarchy level implicitly into arrays (specifically, of type [object[]]); e.g.:
([xml] '<foo><C>one</C><C>two</C></foo>').foo.C[1] # -> 'two'
As the examples (and your own code in the question) show, this allows for access via convenient dot notation.
Note: If you use dot notation to target an element that has at least one attribute and/or child elements, the element itself is returned (an XmlElement instance); otherwise, it is the element's text content. For information about updating XML documents via dot notation, see this answer.
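The distinction is easy to check with a small sketch; the element names plain and rich are made up for illustration:

```powershell
# Hypothetical sample: <plain> has only text content, <rich> also has an attribute.
$doc = [xml] '<foo><plain>text</plain><rich id="1">text</rich></foo>'

# Text-only element: dot notation returns the text content as a string.
$plainType = $doc.foo.plain.GetType().Name   # -> String

# Element with an attribute: dot notation returns the element itself.
$richType = $doc.foo.rich.GetType().Name     # -> XmlElement

# The text of such an element is still available via the intrinsic .InnerText property.
$richText = $doc.foo.rich.InnerText          # -> text

$plainType, $richType, $richText
```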
The downside of dot notation is that there can be name collisions if an incidental input-XML element name happens to be the same as either an intrinsic [System.Xml.XmlElement] property name (for single-element properties) or an intrinsic [Array] property name (for array-valued properties; [System.Object[]] derives from [Array]).
In the event of a name collision, the outcome depends on what the property being accessed contains:
- a single child element ([System.Xml.XmlElement]): the incidental properties win. This too can be problematic, because it makes accessing intrinsic type properties unpredictable; see the bottom section.
- an array of child elements: the [Array] type's properties win.
Therefore, the following element names break dot notation with array-valued properties (obtained with the reflection command Get-Member -InputObject 1, 2 -Type Properties, ParameterizedProperty):
Item Count IsFixedSize IsReadOnly IsSynchronized Length LongLength Rank SyncRoot
See the last section for a discussion of this difference and for how to gain access to the intrinsic [System.Xml.XmlElement] properties in the event of a collision.
The workaround is to use explicit enumeration of array-valued properties, using the ForEach-Object cmdlet, as demonstrated at the top.
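The same collision and workaround can be sketched with another name from the list, Count; the document below is made up for illustration and is not from the question:

```powershell
# Hypothetical sample: each <entry> has a child element named <Count>,
# which collides with the array type's own .Count property.
$doc = [xml] '<root><entry><Count>3</Count></entry><entry><Count>5</Count></entry></root>'

# Dot notation on the array-valued .entry property: the array's .Count wins.
$viaDot = $doc.root.entry.Count                               # -> 2

# Explicit enumeration: .Count is resolved on each XmlElement,
# so the incidental child element wins.
$viaEnum = $doc.root.entry | ForEach-Object { $_.Count }      # -> '3', '5'

"Dot notation:         $viaDot"
"Explicit enumeration: $($viaEnum -join ', ')"
```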
Here is a complete example:
[xml] $xml = @'
<PatchScan>
<Machine>
<Product>
<Name>Windows 10 Pro (x64)</Name>
<Item Class="Patch">
<BulletinId>MSAF-054</BulletinId>
<PatchName>windows10.0-kb3189031-x64.msu</PatchName>
<Status>Installed</Status>
</Item>
<Item Class="Patch">
<BulletinId>MSAF-055</BulletinId>
<PatchName>windows10.0-kb3189032-x64.msu</PatchName>
<Status>Not Installed</Status>
</Item>
</Product>
<Product>
<Name>Windows 7 Pro (x86)</Name>
<Item Class="Patch">
<BulletinId>MSAF-154</BulletinId>
<PatchName>windows7-kb3189031-x86.msu</PatchName>
<Status>Partly Installed</Status>
</Item>
<Item Class="Patch">
<BulletinId>MSAF-155</BulletinId>
<PatchName>windows7-kb3189032-x86.msu</PatchName>
<Status>Uninstalled</Status>
</Item>
</Product>
</Machine>
</PatchScan>
'@
# Enumerate the array-valued .Product property explicitly, so that
# the .Item property can successfully be accessed on each XmlElement instance.
$xml.PatchScan.Machine.Product |
ForEach-Object { $_.Item | Select-Object BulletinId, PatchName, Status }
The above yields:
BulletinId PatchName                     Status
---------- ---------                     ------
MSAF-054   windows10.0-kb3189031-x64.msu Installed
MSAF-055   windows10.0-kb3189032-x64.msu Not Installed
MSAF-154   windows7-kb3189031-x86.msu    Partly Installed
MSAF-155   windows7-kb3189032-x86.msu    Uninstalled
Further down the rabbit hole: What properties are shadowed when:
Note: By shadowing I mean that in the case of a name collision, the "winning" property - the one whose value is reported - effectively hides the other one, thereby "putting it in the shadow".
In the case of using dot notation with arrays, a feature called member-access enumeration comes into play, which applies to any collection in PowerShell v3+; in other words, the behavior is not specific to the [xml] type.
In short: accessing a property on a collection implicitly accesses the property on each member of the collection (item in the collection) and returns the resulting values as an array ([System.Object[]]); e.g.:
# Using member-access enumeration, collect the value of the .prop property from
# the array's individual *members*.
> ([pscustomobject] @{ prop = 10 }, [pscustomobject] @{ prop = 20 }).prop
10
20
However, if the collection type itself has a property by that name, the collection's own property takes precedence; e.g.:
# !! Since arrays themselves have a property named .Count,
# !! member-access enumeration does NOT occur here.
> ([pscustomobject] @{ count = 10 }, [pscustomobject] @{ count = 20 }).Count
2 # !! The *array's* count property was accessed, returning the count of elements
In the case of using dot notation with [xml] (PowerShell-decorated System.Xml.XmlDocument and System.Xml.XmlElement instances), the PowerShell-added, incidental properties shadow the type-intrinsic ones:[2]
While this behavior is easy to grasp, the fact that the outcome depends on the specific input can also be treacherous:
For instance, in the following example the incidental name child element shadows the intrinsic property of the same name on the element itself:
> ([xml] '<xml><child>foo</child></xml>').xml.Name
xml # OK: The element's *own* name
> ([xml] '<xml><name>foo</name></xml>').xml.Name
foo # !! .name was interpreted as the incidental *child* element
If you do need to gain access to the intrinsic type's properties, use .get_<property-name>():
> ([xml] '<xml><name>foo</name></xml>').xml.get_Name()
xml # OK - intrinsic property value, thanks to use of .get_*()
[1] If a given element has both an attribute and an element by the same name, PowerShell reports both, as the elements of an array ([object[]]).
[2] Seemingly, when PowerShell adapts the underlying System.Xml.XmlElement type behind the scenes, it doesn't expose its properties as such, but via get_* accessor methods, which still allows access as if they were properties, but with the PowerShell-added incidental-but-bona-fide properties taking precedence. Do let us know if you know more about this.