grep pattern matches but does not if inverted

Question

I'm trying to do an inverted match of a multi-line HTML `<p>` tag block with grep.

It works when not inverted:

grep -Pz '(?s)\s*<p id="internal_version">.*?</p>\s' test.html

but prints nothing and returns exit code 1 when inverted:

grep -Pzv '(?s)\s*<p id="internal_version">.*?</p>\s' test.html

I would have expected to get the non-matching content of the file.

Example test.html:

<!DOCTYPE html>
<html lang="en">
  <body>
    <div id="page-content-wrapper">
        <div class="container-fluid">
            <p id="internal_version">
                <font size="+1" color="red"><strong>Attention:</strong> <br>
                Text <br> 
                More Text 123 !%&/
            </p>
        </div>
    </div>
</body>
</html>

Use a proper syntax-aware parser like `xmllint`/`xmlstarlet` rather than `grep` — Inian, Jun 22 '20 at 08:48
All-time classic https://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454 — bipll, Jun 22 '20 at 08:52
@bipll Reading the two most upvoted answers, it looks like I entered some sort of war zone between "don't use regex for html!!111" and "It's sometimes fine". I will have a look at xmllint/xmlstarlet, @Inian. But I'm still wondering why `grep -Pz` works and `grep -Pzv` doesn't — Nils, Jun 22 '20 at 09:01
For exactly the same reason that `echo 123 | grep -v 2` prints nothing and returns 1. When you add `-z`, the whole file is treated as a single line. That one line matches the pattern, and `-v` selects only lines that do not match, so nothing is left to print. — bipll, Jun 22 '20 at 09:06
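
The behaviour described in the comment above can be reproduced without the HTML file. This is only a minimal sketch, assuming GNU grep (for `-z`). The sample input `a1`, `b2`, `c3` and the variable names are made up for illustration and are not from the question.

```shell
# Sample input is made up for illustration: three lines, one of which contains "2".

# Line mode: three newline-terminated records. -v drops only the record with "2".
line_status=0
line_mode=$(printf 'a1\nb2\nc3\n' | grep -v 2) || line_status=$?
printf 'line mode: status=%s\n%s\n' "$line_status" "$line_mode"

# -z mode: records end at NUL bytes, so the whole input is one record.
# That single record contains "2", so -v has nothing left to select.
null_status=0
null_mode=$(printf 'a1\nb2\nc3\n' | grep -zv 2) || null_status=$?
printf 'null mode: status=%s output=[%s]\n' "$null_status" "$null_mode"
```

In line mode the two lines without a `2` are printed and the status is 0. In `-z` mode the output is empty and the status is 1, which is what the question observes. Since `-v` can only keep or drop whole records, removing a block from inside that single record would need a tool that substitutes text, not an inverted match.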
