I have a directory of similarly structured HTML files (two examples given):
File-1.html
<html>
<body>
<div class="foo">foo</div>
<div class="bar"><div><p>bar</p></div></div>
<div class="baz">baz</div>
</body>
</html>
File-2.html
<html>
<body>
<div class="foo">foo</div>
<div class="bar"><div><p>apple<br>banana</p></div></div>
<div class="baz">baz</div>
</body>
</html>
I am trying to create a PowerShell script that returns the contents of the bar
div, stripped of all HTML:
For File-1.html: bar
For File-2.html: apple banana
This is what I have so far:
$directory = "C:\Users\Public\Documents\Sandbox\HTML"
foreach ($file in Get-ChildItem($directory))
{
$content = Get-Content "$directory\$file"
echo $content.ParsedHtml.getElementById("bar").innerHTML
}
This returns an error:
You cannot call a method on a null-valued expression.
At C:\Users\Public\Documents\Sandbox\parse-html.ps1:9 char:2
+ echo $content.ParsedHtml.getElementById("bar").innerHTML
I don't understand this error, as bar is an HTML element that exists.
What am I doing wrong?