I am trying to retrieve the data from the list on the left of this website.
The data is structured like this:
<ul class="sections nice_sel">
...
<li class="">
<a href="/c/london-bar/set-overviews-england-and-wales">IMPORTANT_DATA</a>
</li>
...
</ul>
I need to retrieve the inner HTML (IMPORTANT_DATA) of each item in the list.
I tried following this question to get the code:
$url = "http://www.legal500.com/c/london-bar"
$html = Invoke-WebRequest $url
$thelist = $html.ParsedHtml.body.getElementsByTagName('ul') |
Where {$_.getAttributeNode('class').Value -eq 'sections nice_sel'}
But I'm not sure how to get the child (<li>) elements from this.
I also considered using XPath, but I can't seem to pass my $html variable into -Path:
Select-XML -Path $html -XPath "//*[contains(@class, 'sections nice_sel')]"
Select-XML : Cannot find drive. A drive with the name 'PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http' does not exist.
At line:1 char:1
+ Select-XML -Path $html -XPath "//*[contains(@class, 'Test')]"
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : ObjectNotFound: (html ...rict//EN" "http:String) [Select-Xml], DriveNotFoundException
    + FullyQualifiedErrorId : DriveNotFound,Microsoft.PowerShell.Commands.SelectXmlCommand
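A note on this error: -Path expects a file path, so the markup gets interpreted as a drive-qualified path. Select-Xml also has a -Content parameter that takes the XML text itself. The sketch below uses a small hypothetical fragment ($fragment) rather than the live page, because Select-Xml only accepts well-formed XML and I can't assume the real page qualifies; for the actual response, the text would presumably come from $html.Content rather than $html.

```powershell
# Hypothetical well-formed fragment mirroring the list structure shown above
$fragment = @'
<ul class="sections nice_sel">
  <li class=""><a href="/c/london-bar/set-overviews-england-and-wales">IMPORTANT_DATA</a></li>
</ul>
'@

# -Content takes the markup as a string; -Path would treat the same string as a file path
$nodes = Select-Xml -Content $fragment -XPath "//ul[contains(@class, 'sections nice_sel')]/li/a"

# Each result wraps an XmlNode; InnerText is the text inside the <a> element
$names = @($nodes | ForEach-Object { $_.Node.InnerText })
$names
```

If the real page is not valid XML, this call will fail with a parse error instead, so it is only a partial workaround.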
I have also tried:
$url = "http://www.legal500.com/c/london-bar"
$html = Invoke-WebRequest $url
$thelist = $html.ParsedHtml.body.getElementsByTagName('a') |
Where {$_.getAttributeNode('href').Value -contains '/c/london-bar/'}
But for some reason this returns nothing ($thelist is empty).
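One possible explanation for the empty result: -contains checks whether a collection holds an exact element, so it does not do a substring search inside a single string. The sketch below compares it with -like and -match on the href from the sample markup. The $links objects are hypothetical stand-ins for the anchor elements so that it runs without the website; I have not confirmed this is the only problem with the live page.

```powershell
# Sample href taken from the list markup shown above
$href = '/c/london-bar/set-overviews-england-and-wales'

# -contains is collection membership: the one string is not equal to the fragment
$containsResult = $href -contains '/c/london-bar/'
# -like does a wildcard match, -match a regular-expression match
$likeResult  = $href -like '*/c/london-bar/*'
$matchResult = $href -match '/c/london-bar/'

# Hypothetical stand-ins for the <a> elements, to show the filter working offline
$links = @(
    [pscustomobject]@{ href = '/c/london-bar/set-overviews-england-and-wales'; innerText = 'IMPORTANT_DATA' }
    [pscustomobject]@{ href = '/about'; innerText = 'About' }
)

# Keep only the links whose href contains the section path
$matched = @($links | Where-Object { $_.href -like '*/c/london-bar/*' })
$matched.innerText
```

Applied to the attempt above, that would mean swapping -contains for -like '*/c/london-bar/*' inside the Where block.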