I'm trying to write a regex that searches the source code of a series of webpages whose URLs are listed in a CSV file. I'm using the following to do the match:

 # $web is assumed to be a System.Net.WebClient; $regex holds the pattern shown further down
 $linkContent = $web.DownloadString($linkToBeConverted)
 $object = [regex]::Matches($linkContent, $regex)

I want to check whether a list with class="menu" has links anywhere in it. Unfortunately, my pattern matches far more than it should. I want a way to stop the match when it hits a certain string, specifically div class="test", as in the example below.

This is my regular expression now:

(?sm)<ul class="menu">.*?(<a href="h).*?(<\/ul>)

The following is the source code I'm trying to search. This SHOULD NOT be a match if my regular expression were correct. However, because there is a link somewhere between the menu list and the end of the second list (which is not defined as class="menu"), I get a match. Is there any way I can write this regular expression so that it stops when div class="test" is found? Because of the template, div class="test" should always appear in the code right after the menu list.

<ul class="menu">
   <li>
       <p>Yes there are paragraph tags and random stuff in these lists...</p>
   </li>
   <li>
       <div><span>Example</span>
        </div>It's pretty random
   </li>
   <li>Nothing here!</li>
</ul>
<div class="test">
<p><a href="http://match.html"></p>
<ul>
   <li>Unfortunately this will cause a match since there's another list</li>
</ul>
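
One possible way to express the "stop at div class="test"" requirement is a negative lookahead that is checked before every character the pattern consumes. The following is only a sketch under assumptions: the sample HTML is shortened from the example above, and it assumes the menu list contains no nested ul, since the scan also stops at the first closing ul tag.

```powershell
# Shortened stand-in for the page source held in $linkContent in the question.
$linkContent = @'
<ul class="menu">
   <li>Nothing here!</li>
</ul>
<div class="test">
<p><a href="http://match.html"></p>
<ul>
   <li>Another list</li>
</ul>
'@

# (?s) lets . match newlines. The lookahead (?!...) runs before each character,
# so the scan cannot continue past </ul> or <div class="test">.
$regex = '(?s)<ul class="menu">(?:(?!</ul>|<div class="test">).)*?<a href="h'

# IsMatch returns a boolean instead of a match collection.
[regex]::IsMatch($linkContent, $regex)   # False for this input
```

With the original pattern, the lazy .*? is free to expand across the end of the menu list until it finds a link; the lookahead removes that freedom.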

Thank you so so much for your help in advance! I've been working on this all morning and I'm completely lost. If there's a way to do this in PowerShell I'm open to that as well.

Oaryx
  • Are you trying to just get the links in the menu class? – ArcSet Aug 17 '17 at 14:25
  • I'm just trying to return "true" if there are any links in the menu class and "false" if there aren't. – Oaryx Aug 17 '17 at 14:27
  • Even if there is nothing in the links? – ArcSet Aug 17 '17 at 14:31
  • Ah, good point. It has to link to a website, not be an anchor to another part of the page. Anchors are considered "false". I know that's weird, but I do have a reason for it. ;) An empty link is also false. – Oaryx Aug 17 '17 at 14:37
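
The true/false rule described in these comments could be sketched in PowerShell as a two-step check: first isolate the menu block, then look for an absolute link inside it. This is a minimal sketch, not a definitive solution. Test-MenuHasLink is a hypothetical helper name, and the code assumes, as the question states, that div class="test" always follows the menu list and that href values use double quotes.

```powershell
# Test-MenuHasLink is a hypothetical helper, not part of the original script.
function Test-MenuHasLink {
    param([string]$Html)

    # Step 1: capture everything from the menu list up to the first <div class="test">.
    $menu = [regex]::Match($Html, '(?s)<ul class="menu">.*?<div class="test">')
    if (-not $menu.Success) { return $false }

    # Step 2: count a link only if its href starts with http:// or https://,
    # so "#anchor" targets and empty href values are treated as false.
    return [regex]::IsMatch($menu.Value, '<a\s[^>]*href="https?://[^"]+"')
}

$tail = '</ul><div class="test">'
Test-MenuHasLink ('<ul class="menu"><li><a href="#top">Top</a></li>' + $tail)              # False
Test-MenuHasLink ('<ul class="menu"><li><a href="">x</a></li>' + $tail)                    # False
Test-MenuHasLink ('<ul class="menu"><li><a href="http://example.com">x</a></li>' + $tail)  # True
```

Splitting the check into two steps keeps each pattern simple, and a link that appears after div class="test" is never seen by the second pattern.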
