
I'm wondering how to write a regex in Notepad++ that removes the following lines of text from a file.

<div class="navigation">
    <div class="alignleft">&laquo; <a href="/2008/11/28/107/" rel="prev">Subject Matter for Consideration</a>
</div>
    <div class="alignright">
<a href="/2009/01/18/109/" rel="next">Book Reviews</a> &raquo;</div>
<br>
</div>

My first thought was to do it in two steps. For step 1, I entered the following in the Replace tool's "Find what" field:

 <div class="navigation">.*</div>

thinking it would stop at the first </div>. Instead, it deleted everything from the start of the string shown through to the very last </div> in the document, which sort of makes sense because I guess the .* is greedy.
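
The greedy behaviour described above can be reproduced outside Notepad++. What follows is a minimal Python sketch, not the asker's actual file: the two-block document is made up for illustration, and re.DOTALL is used on the assumption that it plays the same role as Notepad++'s ". matches newline" option.

```python
import re

# Hypothetical document: a navigation block followed by unrelated content.
html = (
    '<div class="navigation">\n'
    '<div class="alignleft">prev</div>\n'
    '</div>\n'
    '<p>Body text</p>\n'
    '<div class="footer">footer</div>\n'
)

# re.DOTALL lets "." cross line breaks, so ".*" can span the whole document.
greedy = re.search(r'<div class="navigation">.*</div>', html, re.DOTALL)

# The greedy match swallows the body text and ends at the last </div>.
print('<p>Body text</p>' in greedy.group(0))
print(greedy.group(0).endswith('footer</div>'))
```

Because .* takes as much as it can and only then backtracks, the match ends at the last </div> it can reach rather than the first.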

So how do I get it to stop at the 1st </div>? Alternatively, is it possible to remove all the lines of code with a single regex?

Note that I have 300+ files with similar lines, with the link and anchor text changing in each, so being able to use Notepad++'s Find in Files option with a regex would save a lot of work.

Thanks.

Jim
  • `Note that I have 300+ files with similar lines` ... we need to see all the things you want to match and remove. Otherwise, any answer given will be subject to endless follow-up comments. – Tim Biegeleisen Mar 04 '23 at 05:13
  • Does this answer your question? [My regex is matching too much. How do I make it stop?](https://stackoverflow.com/questions/22444/my-regex-is-matching-too-much-how-do-i-make-it-stop) – Nick Mar 04 '23 at 07:58
  • Regex and HTML/XML are not good friends; use a parser with your favorite scripting language. – Toto Mar 04 '23 at 10:04
  • @Nick: The solution provided in the linked question doesn't work in this case. – Toto Mar 04 '23 at 10:06
  • @Toto although it doesn't solve OP's overall problem, it does answer "how do I get it to stop at the 1st </div>" – Nick Mar 04 '23 at 21:28

2 Answers


@Nick - thank you for pointing out the post about adding the ? to the .* to make it non-greedy. Using that info, I came up with a way to delete the entire block in one step. The regex I used to delete the following block of text:

<div class="navigation">
    <div class="alignleft">&laquo; <a href="/2008/11/28/107/" rel="prev">Subject Matter for Consideration</a>
</div>
    <div class="alignright">
<a href="/2009/01/18/109/" rel="next">Book Reviews</a> &raquo;</div>
<br>
</div>

from the file was:

<div class="navigation">.*?<br>\r\n</div>
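
For anyone who would rather script the same replacement, here is a minimal Python sketch of the lazy pattern. The helper name strip_navigation and the surrounding sample text are made up for illustration, \r?\n is used instead of \r\n on the assumption that some files may have Unix line endings, and re.DOTALL is assumed to stand in for Notepad++'s ". matches newline" option.

```python
import re

# Lazy version of the pattern: stop at the first "<br>" + newline + "</div>".
NAV_BLOCK = re.compile(r'<div class="navigation">.*?<br>\r?\n</div>', re.DOTALL)

def strip_navigation(text: str) -> str:
    """Remove every navigation block matched by NAV_BLOCK (hypothetical helper)."""
    return NAV_BLOCK.sub('', text)

sample = (
    '<p>Before</p>\r\n'
    '<div class="navigation">\r\n'
    '    <div class="alignleft">&laquo; <a href="/2008/11/28/107/" rel="prev">'
    'Subject Matter for Consideration</a>\r\n'
    '</div>\r\n'
    '    <div class="alignright">\r\n'
    '<a href="/2009/01/18/109/" rel="next">Book Reviews</a> &raquo;</div>\r\n'
    '<br>\r\n'
    '</div>\r\n'
    '<p>After</p>\r\n'
)

cleaned = strip_navigation(sample)
print(repr(cleaned))
```

The pattern relies on <br> sitting directly before the outer closing </div>, so it only works on files that follow the same layout as the block shown above.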
Jim

You can simply use this regex for that. It removes all instances of the block from your files.

<div class="navigation">.*?</div>\s*
Lakshitha Samod
  • Unfortunately, this pattern matches two opening div tags but stops at the first closing div tag, leaving the HTML tags unbalanced. – Jim Mar 04 '23 at 17:22
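
The imbalance described in the comment above can be checked with a short Python sketch. The simplified block below is an assumption (shortened link text, Unix line endings), and counting tag substrings is only a rough check, not a real HTML parse.

```python
import re

# Simplified stand-in for the navigation block from the question.
block = (
    '<div class="navigation">\n'
    '    <div class="alignleft">prev</div>\n'
    '    <div class="alignright">next</div>\n'
    '<br>\n'
    '</div>\n'
)

# The lazy pattern from the answer stops at the first </div> it meets,
# which closes the inner "alignleft" div, not the outer navigation div.
leftover = re.sub(r'<div class="navigation">.*?</div>\s*', '', block, flags=re.DOTALL)

opens = leftover.count('<div')
closes = leftover.count('</div>')
print(leftover)
print(opens, closes)
```

One opening tag and two closing tags remain, which is why a pattern anchored on something unique to the end of the block (such as the <br> before the final </div>) is needed.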