
Referencing:

https://www.red-gate.com/simple-talk/sysadmin/powershell/powershell-data-basics-xml/

and:

https://stackoverflow.com/a/65264118/4531180

How is a list of elements and their properties printed?

PS /home/nicholas/powershell> 
PS /home/nicholas/powershell> $doc = new-object System.Xml.XmlDocument
PS /home/nicholas/powershell> $file = resolve-path('./bookstore.xml') 
PS /home/nicholas/powershell> $doc.load($file)                                           
PS /home/nicholas/powershell> 
PS /home/nicholas/powershell> $doc.bookstore.book[1].author.first-name
ParserError: 
Line |
   1 |  $doc.bookstore.book[1].author.first-name
     |                                     ~~~~~
     | Unexpected token '-name' in expression or statement.

PS /home/nicholas/powershell> 
PS /home/nicholas/powershell> $doc.bookstore.book[1].author           

first-name last-name
---------- ---------
Margaret   Atwood

PS /home/nicholas/powershell> 
PS /home/nicholas/powershell> $doc.bookstore

bk          book
--          ----
urn:samples {book, book, book, book}

PS /home/nicholas/powershell> 
  • Be careful with your quoting, I would advise you steer well clear of smart, _(curly)_, quotes, `“./bookstore.xml”`, and use dumb, _(straight)_, quotes instead, `"/bookstore/book[2]"`. – Compo Dec 12 '20 at 12:41
  • Ohhhh, that's a copy paste from another website. I didn't see, thanks. but, not the problem as the file loads fine. – Nicholas Saunders Dec 12 '20 at 12:58
  • I'd wager that the forum software has not changed only one pair of straight quotes, and left all of the others untouched. If you really did copy/paste them, then you've used them, _(the only other alternative is that you edited the pasted information later, and in doing so have introduced them yourself)_. I see you've now edited your comment above! as well as changing your code and output completely too! – Compo Dec 12 '20 at 13:02
  • it was in fact a copy paste from that website, and I was just omitting to specify "book" in the dot operator. but thanks for checking it out. I understand your point. – Nicholas Saunders Dec 12 '20 at 13:04
  • @Compo: While avoiding non-ASCII quote characters is generally advisable, note that PowerShell happily accepts so-called typographic quote characters as well - the only caveat is that your source-code file must be properly encoded (which in Windows PowerShell means a Unicode encoding _with BOM_) - see [this answer](https://stackoverflow.com/a/55053609/45375). – mklement0 Dec 12 '20 at 16:21
  • As an aside: PowerShell functions, cmdlets, scripts, and external programs must be invoked _like shell commands_ - `foo arg1 arg2` - _not_ like C# methods - `foo('arg1', 'arg2')`. If you use `,` to separate arguments, you'll construct an _array_ that a command sees as a _single argument_. To prevent accidental use of method syntax, use [`Set-StrictMode -Version 2`](https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/set-strictmode) or higher, but note its other effects. See [this answer](https://stackoverflow.com/a/65208621/45375) for more information. – mklement0 Dec 12 '20 at 16:21
  • In other words: `resolve-path('./bookstore.xml')` -> `resolve-path './bookstore.xml'` or even `resolve-path ./bookstore.xml` – mklement0 Dec 12 '20 at 16:22
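
The method-syntax pitfall described in the comments above can be sketched as follows. This is only an illustration: Show-ArgCount is a hypothetical helper, not part of the question's code, and the sketch assumes Set-StrictMode is not in effect (with -Version 2 or higher, the method-style call is reported as an error instead).

```powershell
# Show-ArgCount is a hypothetical helper that reports how many arguments it received.
function Show-ArgCount { "Received $($args.Count) argument(s)" }

# Method-style call: the parenthesized, comma-separated list is ONE array argument.
$methodStyle = Show-ArgCount('a', 'b')

# Shell-style call: whitespace separates two distinct arguments.
$shellStyle = Show-ArgCount 'a' 'b'

$methodStyle   # Received 1 argument(s)
$shellStyle    # Received 2 argument(s)
```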

3 Answers


To address the syntax error shown in your question:

# BROKEN: `first-name` cannot be used without quoting
$doc.bookstore.book[1].author.first-name

# OK:
$doc.bookstore.book[1].author.'first-name'

Element name first-name, surfaced in PowerShell as a property name, cannot be used without quotes, because the - is then interpreted as the subtraction operator, resulting in the error that you saw.

In short:

  • Only letters, digits, and _ (underscores) can be used in unquoted property names; ditto for unquoted variable names.[1]

  • When in doubt, quote.
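
As a small sketch of the quoting rule, here is the same access against an inline document. $sampleDoc and $propName are names made up for this illustration, and the XML merely mimics one book of the question's file. Besides a quoted literal, a variable that holds the property name also works.

```powershell
# Inline stand-in for one <book> of the question's bookstore.xml (assumed shape).
$sampleDoc = [xml] '<bookstore><book><author><first-name>Margaret</first-name><last-name>Atwood</last-name></author></book></bookstore>'

# Quoted literal property name:
$first = $sampleDoc.bookstore.book.author.'first-name'

# Property name stored in a variable:
$propName = 'last-name'
$last = $sampleDoc.bookstore.book.author.$propName

"$first $last"   # Margaret Atwood
```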


[1] Exact rules for identifier names in PowerShell:

More accurately, PowerShell permits an identifier to be unquoted if it is composed solely of _ (underscore) and characters from the Unicode letter and decimal-digit categories (the categories are defined in .NET as enumeration System.Globalization.UnicodeCategory; their two-letter shorthands can be used with \p{<shortCategoryName>} in regular expressions).

Types of identifiers:

  • property names (.foo)
  • keys in hashtable literals (@{ foo = ... })
  • variable names ($foo)

Identifiers do support additional characters, but then require additional syntax. Identifiers that do not adhere to the above rules must be:

  • property names and hashtable keys: quoted ('last-name' or "last-name")
  • variable names: enclosed in {...} (${last-name})
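
A minimal sketch of both forms, using a hashtable invented for this illustration:

```powershell
# Hashtable keys containing '-' must be quoted in the literal.
$author = @{ 'first-name' = 'Margaret'; 'last-name' = 'Atwood' }

# Property-style access needs the same quoting (single or double quotes).
$author.'first-name'      # Margaret
$author."last-name"       # Atwood

# A variable name containing '-' must be enclosed in {...}.
${last-name} = $author.'last-name'
${last-name}              # Atwood
```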

Note that extended rules apply to command names (names of functions, cmdlets, script or executable files as well as aliases thereof) and module names:

  • In addition to the above, the following are allowed without quoting:
    • - ("hyphen", "minus", loosely: "dash")
    • . ("period", "full stop")
    • In file paths also: \ and /

Among command names, with the exception of function and cmdlet names (which you cannot even define with other characters), you can get away with using additional characters in names, as long as you quote the command name on invocation.

However, doing so is ill-advised, as users will generally expect commands not to require such quoting; to use a contrived example: Set-Alias 'a&b' Get-Date; & 'a&b' technically works, but the awkwardness of the invocation (quoting, which then requires &) makes this a poor choice.

mklement0

To address the for-display formatting problem:

If you look closely at the sample output from your own answer, you'll see that even though the display is mostly helpful, the value of the author property is author instead of showing the (presumed) first-name and last-name child-element values.

The problem is that PowerShell's default output formatting represents child elements that have:

  • at least one attribute
  • and/or at least one child element themselves

by the element's name only.

Especially with deeply nested elements this results in unhelpful output.
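
The distinction can be checked directly. In this sketch ($sample and its element names are invented for illustration), a text-only element without attributes surfaces as a plain string, whereas an element with an attribute or with child elements surfaces as an XmlElement, which is what the default formatter then shows by name only:

```powershell
# <leaf>: text only; <tagged>: has an attribute; <branch>: has a child element.
$sample = [xml] '<root><leaf>text</leaf><tagged id="1">text</tagged><branch><child>c</child></branch></root>'

$leafIsString    = $sample.root.leaf   -is [string]
$taggedIsElement = $sample.root.tagged -is [System.Xml.XmlElement]
$branchIsElement = $sample.root.branch -is [System.Xml.XmlElement]

$leafIsString, $taggedIsElement, $branchIsElement   # True, True, True
```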

Workarounds, possibly in combination:

  • Access the .OuterXml or .InnerXml property of such elements, which contains the full XML text of the element with / without the element's tags themselves.

    • This will likely be visually helpful only with at most one more level of nesting, given that the XML text is a single-line representation that is not pretty-printed.

    • You can pipe .OuterXml / .InnerXml values to a pretty-printing function, which requires some extra work, however, because no such functionality is directly exposed by PowerShell.

  • Use Select-Object (or, for display purposes only, a Format-* cmdlet such as Format-Table) with calculated properties.

    • While this allows you full control over what is displayed, it is more work.

See the examples below.


# Sample XML document
$xmlDoc = [xml] @"
<?xml version="1.0"?>
<bookstore>
   <book id="bk101">
      <author>
        <first-name>Matthew</first-name>
        <last-name>Gambardella</last-name>
      </author>
      <title>XML Developer's Guide</title>
      <genre>Computer</genre>
      <price>44.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
   <book id="bk102">
      <author>
        <first-name>Kim</first-name>
        <last-name>Rall</last-name>
      </author>
      <title>Midnight Rain</title>
      <genre>Fantasy</genre>
      <price>5.95</price>
      <publish_date>2000-12-16</publish_date>
      <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
   </book>
</bookstore>
"@

To get a helpful representation of all <book> elements including the <author> child elements' <first-name> and <last-name> child elements via Select-Object and a calculated property:

$xmldoc.bookstore.book | Select-Object id, 
   @{ n='author'; e={ $_.author.'first-name' + ' ' + $_.author.'last-name'} }, 
   title, genre, price, publish_date, description

This yields (note how the author property now lists first and last name):

id           : bk101
author       : Matthew Gambardella
title        : XML Developer's Guide
genre        : Computer
price        : 44.95
publish_date : 2000-10-01
description  : An in-depth look at creating applications with XML.

id           : bk102
author       : Kim Rall
title        : Midnight Rain
genre        : Fantasy
price        : 5.95
publish_date : 2000-12-16
description  : A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.
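
For display purposes only, the same calculated property can be passed to Format-Table instead. The following is a sketch against a shortened inline document; $sample and $authorColumn are names chosen for this illustration, and the spelled-out Name / Expression keys are equivalent to n / e.

```powershell
# Shortened stand-in for the sample document (one book, fewer child elements).
$sample = [xml] '<bookstore><book id="bk101"><author><first-name>Matthew</first-name><last-name>Gambardella</last-name></author><title>XML Developer''s Guide</title></book></bookstore>'

# Calculated property that joins the author's first and last name.
$authorColumn = @{
  Name       = 'author'
  Expression = { $_.author.'first-name' + ' ' + $_.author.'last-name' }
}

# Tabular, for-display-only output with the combined author column.
$sample.bookstore.book | Format-Table id, $authorColumn, title
```

Because Format-* cmdlets emit formatting instructions rather than data, use Select-Object with the same hashtable if the result is to be processed further.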

To get a helpful representation of all <book> elements via pretty-printed XML, via an auxiliary System.Xml.Linq.XDocument instance:

# Load the assembly that contains XDocument.
# Note: Required in Windows PowerShell only, and only once per session.
Add-Type -AssemblyName System.Xml.Linq

$xmldoc.bookstore.book | ForEach-Object {
  ([System.Xml.Linq.XDocument] $_.OuterXml).ToString()
}

This yields (a pretty-printed XML representation):

<book id="bk101">
  <author>
    <first-name>Matthew</first-name>
    <last-name>Gambardella</last-name>
  </author>
  <title>XML Developer's Guide</title>
  <genre>Computer</genre>
  <price>44.95</price>
  <publish_date>2000-10-01</publish_date>
  <description>An in-depth look at creating applications with XML.</description>
</book>
<book id="bk102">
  <author>
    <first-name>Kim</first-name>
    <last-name>Rall</last-name>
  </author>
  <title>Midnight Rain</title>
  <genre>Fantasy</genre>
  <price>5.95</price>
  <publish_date>2000-12-16</publish_date>
  <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
</book>

Note that you could wrap the formatting code in a simple (filter) function named Format-Xml that you could put in your $PROFILE file (in Windows PowerShell, also place the Add-Type -AssemblyName System.Xml.Linq there, above it):

filter Format-Xml { ([System.Xml.Linq.XDocument] $_.OuterXml).ToString() }

Now the formatting is as simple as:

$xmldoc.bookstore.book | Format-Xml
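
As an alternative that does not go through XDocument, the following sketch pretty-prints a node with System.Xml.XmlWriter instead. This is a different technique from the filter above; Format-XmlNode and $demo are hypothetical names, and the sketch assumes PowerShell 5 or later for the ::new() syntax.

```powershell
# Hypothetical helper: serializes an XML node with indentation via XmlWriter.
function Format-XmlNode {
  param([Parameter(Mandatory)] [System.Xml.XmlNode] $Node)

  $settings = [System.Xml.XmlWriterSettings]::new()
  $settings.Indent = $true                 # put child elements on indented lines
  $settings.OmitXmlDeclaration = $true     # no <?xml ...?> header in the output

  $stringWriter = [System.IO.StringWriter]::new()
  $xmlWriter = [System.Xml.XmlWriter]::Create($stringWriter, $settings)
  try {
    $Node.WriteTo($xmlWriter)              # write the node, including its own tags
    $xmlWriter.Flush()
  }
  finally {
    $xmlWriter.Dispose()
  }
  $stringWriter.ToString()
}

$demo = [xml] '<book id="bk101"><author><first-name>Kim</first-name></author><title>Midnight Rain</title></book>'
$pretty = Format-XmlNode $demo.DocumentElement
$pretty
```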
mklement0

I just wasn't specifying book in the property path:

PS /home/nicholas/powershell> 
PS /home/nicholas/powershell> $doc.bookstore.book

genre           : novel
publicationdate : 1997
ISBN            : 1-861001-57-8
title           : Pride And Prejudice
author          : author
price           : 24.95

genre           : novel
publicationdate : 1992
ISBN            : 1-861002-30-1
title           : The Handmaid's Tale
author          : author
price           : 29.95

genre           : novel
publicationdate : 1991
ISBN            : 1-861001-57-6
title           : Emma
author          : author
price           : 19.95

genre           : novel
publicationdate : 1982
ISBN            : 1-861001-45-3
title           : Sense and Sensibility
author          : author
price           : 19.95


PS /home/nicholas/powershell> 

whoops.