
This is what I put into PowerShell:

PS > $source = "http://www.bing.com/search?q=sqrt(2)"
PS > $result = Invoke-WebRequest $source
PS > $resultContainer = $result.ParsedHtml.GetElementById("results_container")

This is the error message I got back:

The property 'ParsedHtml' cannot be found on this object. Verify that the property exists.
At line:1 char:1
+ $resultContainer = $result.ParsedHtml.GetElementById("results_contain ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: (:) [], PropertyNotFoundException
    + FullyQualifiedErrorId : PropertyNotFoundStrict
JasonMArcher
Ahmad Taj

2 Answers


I don't believe you can do this (at least not yet) with PowerShell on non-Windows platforms. To parse the HTML content, PowerShell uses MSHTML.DLL and/or other Internet Explorer/Edge components, which do not exist outside Windows. Note that GetElementById just proxies to the COM object, and there are no COM objects in your environment.

You can inspect the RawContent property of the object returned by Invoke-WebRequest and parse that string yourself to find the content you want. Parsing HTML with regular expressions is a non-starter, though, so you'll have to use other methods.

BTW, I was unable to locate an element with an id of results_container on the page you're using in your example.
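
To avoid the PropertyNotFoundStrict error in a script that has to run on more than one platform, one option is to check for the property before touching it. The following is only a sketch: `Test-ParsedHtmlAvailable` is a hypothetical helper name, and `$fakeResponse` is a stand-in object rather than a real Invoke-WebRequest result, so the snippet runs without network access.

```powershell
# Hypothetical helper (not a built-in cmdlet): returns $true only when the
# response object exposes a ParsedHtml property.
function Test-ParsedHtmlAvailable {
    param([Parameter(Mandatory)] [object] $Response)
    # PSObject.Properties[...] yields $null for a missing property instead of
    # throwing, so the check is safe even under Set-StrictMode.
    return $null -ne $Response.PSObject.Properties['ParsedHtml']
}

# Stand-in for a response that carries the markup in Content but has no
# ParsedHtml property (an assumption made for illustration).
$fakeResponse = [pscustomobject]@{
    Content = '<html><body><p id="x">hi</p></body></html>'
}

if (Test-ParsedHtmlAvailable $fakeResponse) {
    $html = $fakeResponse.ParsedHtml
} else {
    # Fall back to the markup string and hand it to an HTML parser of your choice.
    $html = $fakeResponse.Content
}
$html
```

With a real response, you would pass the object returned by Invoke-WebRequest in place of `$fakeResponse`.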

alroc

What works (but is a bit messy) is to load AngleSharp into PowerShell as a .NET assembly. This is also suggested in a PowerShell GitHub issue.

[string]$html = "<!DOCTYPE html>
<html lang=en>
    <meta charset=utf-8>
    <meta name=viewport content=""initial-scale=1, minimum-scale=1, width=device-width"">
    <title>Error 404 (Not Found)!!1</title>
    <a href=//www.google.com/><span id=logo aria-label=Google></span></a>
    <p><b>404.</b> <ins>That’s an error.</ins>
    <p>The requested URL <code>/error</code> was not found on this server.  <ins>That’s all we know.</ins>";

#Loads assembly for angle sharp: https://stackoverflow.com/questions/39257572/loading-assemblies-from-nuget-packages 
#WARNING: probably in a non-portable way.
$standardAssemblyFullPath = (Get-ChildItem -Filter *.dll -Recurse (Split-Path (get-package AngleSharp).Source)).FullName | Where-Object {$_ -like "*standard*"}
Add-Type -Path $standardAssemblyFullPath

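# Newer AngleSharp releases expose the parser as AngleSharp.Html.Parser.HtmlParser with a ParseDocument method.
# Older releases used AngleSharp.Parser.Html.HtmlParser and Parse instead.
$parser = New-Object AngleSharp.Html.Parser.HtmlParser
$document = $parser.ParseDocument($html);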

$elements = $document.All | Where-Object {$_.id -eq "logo"};

Write-Host $elements.OuterHtml
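
As an alternative to filtering `$document.All`, a CSS selector can pick out the element directly. This is a sketch, not a tested recipe: it assumes the AngleSharp assembly has already been loaded with Add-Type as above, that the installed release is a newer one where the parser lives in `AngleSharp.Html.Parser`, and that `QuerySelector` is callable on the returned document object. `Find-HtmlElement` is a hypothetical helper name.

```powershell
# Hypothetical helper: parse an HTML string with AngleSharp and return the
# first element matching a CSS selector ($null if nothing matches).
# Assumes the AngleSharp assembly was loaded earlier via Add-Type.
function Find-HtmlElement {
    param(
        [Parameter(Mandatory)] [string] $Html,
        [Parameter(Mandatory)] [string] $Selector
    )
    $parser = [AngleSharp.Html.Parser.HtmlParser]::new()
    $document = $parser.ParseDocument($Html)
    return $document.QuerySelector($Selector)
}

# Example call (only works once AngleSharp is loaded):
# (Find-HtmlElement -Html $html -Selector '#logo').OuterHtml
```

The selector `#logo` targets the same element as the `id -eq "logo"` filter, without enumerating every node in the document.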
Diederik
    `HtmlParser` belongs to the `AngleSharp.Html.Parser` namespace, not `AngleSharp.Parser.Html`, so the line should read `$parser = New-Object AngleSharp.Html.Parser.HtmlParser`. – Mavaddat Javid Sep 16 '21 at 19:39