6

Given the following xml:

<foo bar="&amp;foobar">some text</foo>

I need to get the value of the bar attribute without it being unescaped. Every method I've tried thus far in PowerShell results in this value:

&foobar

Rather than this:

&amp;foobar

I need the latter, as I need the literal, properly escaped value to persist.

If I do this:

[xml]$xml = "<foo bar='&amp;foobar'>some text</foo>"
$xml.foo.bar

The attribute value is unescaped (i.e. &foobar).

If I do this:

$val = $xml | select-xml "/foo/@bar"
$val.Node.Value

The attribute value is unescaped (i.e. &foobar).

What is the best way to ensure that I get the original, escaped value of an attribute with PowerShell?

kiprainey

3 Answers

12
[Security.SecurityElement]::Escape($xml.foo.bar)
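A minimal sketch of this approach against the question's sample XML. It re-escapes the value the parser has already unescaped rather than reading the original text, so it only reproduces the source when the source used the standard entity forms.

```powershell
# Sample document from the question
[xml]$xml = "<foo bar='&amp;foobar'>some text</foo>"

# The parser hands back the unescaped value
$raw = $xml.foo.bar                                   # &foobar

# SecurityElement.Escape re-encodes the XML special characters
$escaped = [Security.SecurityElement]::Escape($raw)   # &amp;foobar
$escaped
```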
Shay Levy
  • This is probably better than using [HtmlEncode](https://stackoverflow.com/a/13976028/712526), but why is this not under `System.Xml`? – jpaugh Oct 02 '19 at 20:17
8

Using the sample XML above, each of the following will produce the original, escaped value for the bar attribute:

Using XPath:

$val = $xml | select-xml "/foo/@bar"
$val.Node.get_innerXml()

Using PowerShell's native XML syntax:

$xml.foo.attributes.item(0).get_innerXml()
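As a small variation, the attribute can be looked up by name rather than by position, which should be less fragile if the element gains more attributes. This is only a sketch; as far as I know, InnerXml is generated from the parsed node rather than copied from the source text, so a character reference such as &#38; in the input would presumably come back as &amp;.

```powershell
# Sample document from the question
[xml]$xml = "<foo bar='&amp;foobar'>some text</foo>"

# Look the attribute up by name instead of by position
$attr = $xml.foo.Attributes['bar']

$attr.Value      # &foobar      (unescaped)
$attr.InnerXml   # &amp;foobar  (serialized with entities)
```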
kiprainey
6

You can also use

Add-Type -AssemblyName System.Web  # load the assembly first if HttpUtility is not already available
[System.Web.HttpUtility]::HtmlEncode($xml.foo.bar)

There is a good answer on html encoding with PowerShell found here: What is the best way to escape html specific characters in a string in (PowerShell)

I'm not sure it's any better than @shay's answer, because the data still passes through the XML parser, which returns the unescaped value, which is then passed through a function to escape it again.

The content has been manipulated in either case, so it's not "the original content". It may be splitting hairs, but in the past, when I've needed non-repudiation on what was originally sent, I've stored the whole blob as text.

It may be acceptable to grab the text by accessing the @bar attribute's OuterXml property. That OuterXml property will return:

bar="&amp;foobar"

From there, we can do something like:

$xml.foo.attributes['bar'].OuterXml.Split("=")[1]

Which returns:

"&amp;foobar"

I think this is where we want to end up, but you can probably do that in a little nicer way. :)
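One possible tidier variant, offered as a sketch rather than a definitive fix: split on the first "=" only, so that an "=" inside the attribute value does not truncate the result, and then strip the surrounding quotes. It assumes OuterXml emits the value in double quotes, as in the output shown above.

```powershell
# Sample document from the question
[xml]$xml = "<foo bar='&amp;foobar'>some text</foo>"

$outer = $xml.foo.Attributes['bar'].OuterXml     # bar="&amp;foobar"

# Split on the first "=" only, so an "=" inside the value survives,
# then strip the surrounding double quotes
$value = ($outer -split '=', 2)[1].Trim('"')
$value                                           # &amp;foobar
```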

Zach Bonham
  • Thanks for the answer, Zach. What you and Shay have proposed will both work to get the correct value, but you are also correct that the value is being unescaped and re-escaped. I'm holding out hope that there is a straightforward means of accessing the unmanipulated value. – kiprainey Dec 20 '12 at 16:25
  • Hmmm....$xml.foo.attributes['bar'].OuterXml returns unescaped data, along with the attribute though. Seems like there should be an InnerText property on the attribute :). Spelunking through XmlAttribute class now – Zach Bonham Dec 20 '12 at 16:31
  • 1
    Using my example above, $val.Node.get_innerXml() returns the correct, escaped value, as does $xml.foo.attributes.item(0).get_innerXml(). I think we have a winner. – kiprainey Dec 20 '12 at 17:04
  • Thanks for your help -- you got me moving in the right direction. – kiprainey Dec 20 '12 at 17:10
  • That's what it's all about! – Zach Bonham Dec 20 '12 at 17:14