I am trying to parse a page from https://pinvoke.net using Windows PowerShell. Normally, when I have an XML string, I can convert it to a more workable object by casting the string to the [xml] type. However, when I try to parse the following page, I get an error. It doesn't like the src attribute on line 14:
$page = ( Invoke-WebRequest https://www.pinvoke.net/default.aspx/advapi32/CreateProcessAsUser.html ).Content
$xmlPage = [xml]$page # throws an error
The error (truncated since the message looks like it includes the full page content):
Cannot convert value "XML STRING HERE" to type "System.Xml.XmlDocument".
Error: "'src' is an unexpected token. The expected token is '='. Line 14, position 15."
The line in question looks like this:
<script async src = "https://www.googletagmanager.com/gtag/js?id=UA-115015704-1" ></script>
If I copy the XML to a file and either remove that line or remove async, then read the file and attempt the conversion again, it gets further, but I keep running into additional XML errors (I removed two async attributes in total before giving up because of further parsing errors).
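The failure seems reproducible without downloading the page at all. Below is a minimal sketch, assuming the valueless async attribute is what the parser trips on; the example.invalid URL is a placeholder I made up and is not taken from the page:

```powershell
# Reduced version of the offending element: async has no value.
# The URL is a made-up placeholder, not the one from pinvoke.net.
$snippet = '<script async src="https://example.invalid/x.js"></script>'

try {
    # Same cast as in the question, applied to a single element.
    [xml]$snippet | Out-Null
}
catch {
    # Print the conversion error instead of stopping the script.
    $_.Exception.Message
}

# Same element, but with an explicit value for async.
$fixed = '<script async="async" src="https://example.invalid/x.js"></script>'

# This cast succeeds, and the attribute is exposed as a property.
([xml]$fixed).script.src
```

The first cast throws and the second one parses, so the cast appears to object to the attribute syntax itself and not to anything specific to this page.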
Why does the cast to [xml] fail?
Edit:
Looks like ConvertTo-Xml converts the .NET object into an XML string. It's represented under the XmlDocument type, but the most I can extract out of it is the same string. I've re-titled the question accordingly and removed the statements that ConvertTo-Xml was working correctly for me.