I'm trying to write a regex that searches the source code of a series of webpages listed in a CSV file. I'm using the following to do the match:
$linkContent = $web.DownloadString($linkToBeConverted)
$object = [regex]::Matches($linkContent, $regex)
I'm trying to search a list with class="menu" to see whether it contains any links. Unfortunately, I seem to be matching far more than I need. I want a way to stop the match when I hit a certain string, specifically div class="test", as in the example below.
This is my regular expression now:
(?sm)<ul class="menu">.*?(<a href="h).*?(<\/ul>)
The following is the source code I'm trying to search. This SHOULD NOT be a match if my regular expression were correct. However, because there is a link somewhere between the menu list and the end of the second list (which is not defined as class="menu"), I get a match. Is there any way I can write this regular expression so that it stops when div class="test" is found? Because of the template, div class="test" should always appear in the code right after the menu list.
<ul class="menu">
<li>
<p>Yes there are paragraph tags and random stuff in these lists...</p>
</li>
<li>
<div><span>Example</span>
</div>It's pretty random
</li>
<li>Nothing here!</li>
</ul>
<div class="test">
<p><a href="http://match.html"></p>
<ul>
<li>Unfortunately this will cause a match since there's another list</li>
</ul>
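To make the problem reproducible without downloading anything, here is a minimal sketch that runs the current pattern against the sample above. It also tries one possible way to bound the match: a negative lookahead that refuses to step over div class="test". The `$bounded` pattern and all variable names are assumptions made for this sketch, not a confirmed fix.

```powershell
# Sample source from above, in a single-quoted here-string so nothing is expanded.
$sample = @'
<ul class="menu">
<li>
<p>Yes there are paragraph tags and random stuff in these lists...</p>
</li>
<li>
<div><span>Example</span>
</div>It's pretty random
</li>
<li>Nothing here!</li>
</ul>
<div class="test">
<p><a href="http://match.html"></p>
<ul>
<li>Unfortunately this will cause a match since there's another list</li>
</ul>
'@

# Current pattern: the lazy .*? is free to run past the first </ul> to find a link.
$current = '(?sm)<ul class="menu">.*?(<a href="h).*?(<\/ul>)'

# Hypothetical bounded pattern: a character is consumed only if
# <div class="test"> does not start at that position.
$stop    = '(?:(?!<div class="test">).)*?'
$bounded = '(?s)<ul class="menu">' + $stop + '(<a href="h)' + $stop + '(</ul>)'

$currentCount = [regex]::Matches($sample, $current).Count
$boundedCount = [regex]::Matches($sample, $bounded).Count

# A menu that does contain a link, to check the bounded pattern still finds it.
$withLink = '<ul class="menu"><li><a href="http://example.com">x</a></li></ul><div class="test">'
$withLinkCount = [regex]::Matches($withLink, $bounded).Count

"current: $currentCount, bounded: $boundedCount, menu with link: $withLinkCount"
```

On the sample, the current pattern produces one (unwanted) match, while the bounded sketch produces none and still matches a menu that really contains a link.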
Thank you so much in advance for your help! I've been working on this all morning and I'm completely lost. If there's a way to do this in PowerShell without one big regex, I'm open to that as well.
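By a PowerShell approach I mean something along these lines: cut the page at div class="test" first, then look for a link only in the part that is left. This is a rough sketch; the function name `Test-MenuHasLink` is made up, and it assumes the opening tags appear in the source exactly as written in the template.

```powershell
function Test-MenuHasLink {
    param([string]$Html)

    # Ordinal comparison avoids culture-sensitive surprises in IndexOf.
    $ordinal = [System.StringComparison]::Ordinal

    # Find where the menu list starts; no menu means no link in the menu.
    $start = $Html.IndexOf('<ul class="menu">', $ordinal)
    if ($start -lt 0) { return $false }

    # Stop at the div that the template places right after the menu.
    # If it is missing, fall back to the end of the page.
    $end = $Html.IndexOf('<div class="test">', $start, $ordinal)
    if ($end -lt 0) { $end = $Html.Length }

    # Only the text between the two markers is searched for a link.
    $menu = $Html.Substring($start, $end - $start)
    return $menu -match '<a href="h'
}

$noLink  = '<ul class="menu"><li>Nothing here!</li></ul><div class="test"><p><a href="http://match.html"></p>'
$hasLink = '<ul class="menu"><li><a href="http://example.com">x</a></li></ul><div class="test">'

$noLinkResult  = Test-MenuHasLink -Html $noLink
$hasLinkResult = Test-MenuHasLink -Html $hasLink
```

Note that `-match` is case-insensitive by default, so this would also accept `<A HREF="H...`.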