Sublime Text 3 - Find and remove multiple searching by "id" in multiple files

Question

I have several HTML files that share the same <section> structure but have different content.

I would like to know whether these sections can be removed from multiple files at once using Sublime Text.

Example:

<section class="all-classes" id="section1">
     content 
</section>
<section class="all-classes" id="section2-do-not-remove-section">
     content 
</section>
<section class="all-classes" id="section3">
     content 
</section>
<section class="all-classes" id="section4">
     content 
</section>

In this example, I would like to remove sections 1, 3 and 4 and keep section 2.

You want to delete all sections where the format of the id is `section#` where `#` is 1,2,3,4,5,6,7,8,9,10,11,... ? — Niel Godfrey Pablo Ponciano, Sep 03 '21 at 04:02
@Ouroborus From [What topics can I ask about here?](https://stackoverflow.com/help/on-topic) in the [help], software questions are allowed if they cover *"[...] software tools commonly used by programmers".* Sublime Text, like Vim, Emacs, VSCode, etc., is a programming editor, and there are [tens of thousands of questions](https://stackoverflow.com/questions/tagged/vim+or+vi+or+emacs+or+visual-studio-code) about them on this site that are perfectly on-topic. Also, this is a programming question because the answer is to use an HTML parser. — MattDMo, Sep 03 '21 at 11:31
This is actually a job for an HTML parser, not regex. See [this](https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags) for a good laugh, but also for some good answers explaining why regex is not the tool for this job. [`BeautifulSoup4`](https://www.crummy.com/software/BeautifulSoup/bs4/doc/) or [`lxml`](https://lxml.de) are the tools of choice for Python, I don't know about other languages. — MattDMo, Sep 03 '21 at 11:34

score 0 · Answer 1 · answered Sep 06 '21 at 01:53

As mentioned by MattDMo in the comments, an HTML parser would be your best option for this job.

For simple cases where you just need a quick find+replace, you may use this RegEx:

<section.*id="section[\d]*"[\s\S]*?<\/section>

See it in action here.

Where:

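<section.* - Matches text that starts with the opening section tag, e.g. <section class="all-classes". In most regex flavors . does not match a newline by default, so the id has to be on the same line as <section.
id="section[\d]*" - Matches ids made of section followed by digits, e.g. id="section32". Note that * also allows zero digits, so id="section" matches too; use \d+ instead if at least one digit is required.
[\s\S]*? - Matches any characters, whitespace or not (including newlines), in a non-greedy way. This prevents the match from spanning multiple sections.
<\/section> - Matches the closing tag </section>. Because the preceding part is non-greedy, this is always the closest </section> tag.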
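To sanity-check the pattern outside the editor, here is a minimal Python sketch that applies the same expression with the standard `re` module to a shortened version of the sample from the question. The names `PATTERN`, `remove_numbered_sections` and `sample` are only illustrative; nothing here is part of Sublime Text.

```python
import re

# Same expression as above. Python's re does not need "/" escaped, but "\/" is harmless.
PATTERN = re.compile(r'<section.*id="section[\d]*"[\s\S]*?<\/section>')


def remove_numbered_sections(html: str) -> str:
    """Delete every <section> whose id is "section" followed only by digits."""
    return PATTERN.sub("", html)


sample = """<section class="all-classes" id="section1">
     content
</section>
<section class="all-classes" id="section2-do-not-remove-section">
     content
</section>
<section class="all-classes" id="section3">
     content
</section>
"""

cleaned = remove_numbered_sections(sample)
print(cleaned)  # only the section2-do-not-remove-section block is left
```

The blank lines that separated the removed sections stay behind, just as they would after a find+replace with an empty replacement in the editor.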

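WARNING: If you have nested sections (a section within a section), this will not work, because the match ends at the first inner </section> and leaves the rest of the outer section behind. You have to use an HTML parser for that.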
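For the nested case, a parser-based approach could look roughly like the sketch below. It uses Python's built-in `html.parser` rather than BeautifulSoup4 or lxml so that it runs without extra installs. The class `SectionSpanFinder`, the function `remove_sections` and the `ID_PATTERN` rule (ids of the form section plus digits) are assumptions made for illustration, not an established recipe.

```python
import re
from html.parser import HTMLParser

# Assumption: ids to delete look like "section<digits>", e.g. "section3".
ID_PATTERN = re.compile(r"section\d+")


class SectionSpanFinder(HTMLParser):
    """Collect (start, end) character offsets of each <section> to delete."""

    def __init__(self, html: str):
        super().__init__()
        self._html = html
        # getpos() reports (line, column); precompute where each line starts.
        self._line_starts = [0] + [i + 1 for i, ch in enumerate(html) if ch == "\n"]
        self._depth = 0  # nesting depth inside a section that is being removed
        self._start = 0
        self.spans = []
        self.feed(html)
        self.close()

    def _offset(self) -> int:
        line, col = self.getpos()  # position where the current tag begins
        return self._line_starts[line - 1] + col

    def handle_starttag(self, tag, attrs):
        if tag != "section":
            return
        if self._depth:
            self._depth += 1  # nested section inside one already marked
        elif ID_PATTERN.fullmatch(dict(attrs).get("id") or ""):
            self._depth, self._start = 1, self._offset()

    def handle_endtag(self, tag):
        if tag == "section" and self._depth:
            self._depth -= 1
            if self._depth == 0:
                # The span ends just after the ">" of the matching closing tag.
                end = self._html.index(">", self._offset()) + 1
                self.spans.append((self._start, end))


def remove_sections(html: str) -> str:
    """Return html with every marked section (and anything nested in it) cut out."""
    out, last = [], 0
    for start, end in SectionSpanFinder(html).spans:
        out.append(html[last:start])
        last = end
    out.append(html[last:])
    return "".join(out)


nested = (
    '<section id="section1">outer <section id="inner">x</section> tail</section>'
    '<section id="section2-do-not-remove-section">keep</section>'
)
print(remove_sections(nested))
```

Because the parser counts opening and closing section tags, the outer section is removed together with the one nested inside it. To process several files, each one could be read with `pathlib.Path.read_text()`, passed through `remove_sections`, and written back with `write_text()`; try it on copies first.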
