For me, your regex works as expected. Given an input file file.md containing:
{% highlight ruby %}
{% endhighlight %}
not this line, though
nor {%this%}
When I run your command (avoiding UUoC, a useless use of cat), I get the output shown:
$ egrep '{% +' file.md
{% highlight ruby %}
{% endhighlight %}
$
You've not identified which version of egrep you are using and which platform you are using it on. I'm running Mac OS X 10.11.6 and using egrep (BSD grep) 2.5.1-FreeBSD (but I also get the same result with GNU Grep 2.25).
You should be aware, though, that { is a metacharacter to egrep, and the problem may be that it is not handling the initial { as you expect.
For example, here's a more complex egrep invocation that should only select the endhighlight line:
$ egrep '\{% {1,4}[a-z]{4,20} {1,4}%\}' file.md
{% endhighlight %}
$
I used the backslashes to escape the first and last braces. The {n,m} notation means n ≤ x ≤ m matches of the preceding regex (here, a blank and [a-z]). You can omit ,m to require exactly n matches, or use an open-ended form such as {4,} for four or more; check the manual to understand these. However, on my machine, I can also run:
$ egrep '{% {1,4}[a-z]{4,20} {1,4}%}' file.md
{% endhighlight %}
$
Presumably, because the first { doesn't start an {n,m} sequence, it is treated as an ordinary character.
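The three interval forms mentioned above can be compared side by side. This is only a sketch: it recreates the sample file in a scratch directory from mktemp, uses grep -E (the POSIX spelling of egrep) rather than egrep itself, and the variable names are made up for illustration. The patterns are surrounded by blanks so that the letter count applies to a whole word rather than to part of a longer one.

```shell
# Recreate the sample file in a scratch directory (location is arbitrary).
tmpdir=$(mktemp -d)
cat > "$tmpdir/file.md" <<'EOF'
{% highlight ruby %}
{% endhighlight %}
not this line, though
nor {%this%}
EOF

# {9}: a blank, exactly 9 lowercase letters, a blank.
exact=$(grep -E ' [a-z]{9} ' "$tmpdir/file.md")

# {9,}: a blank, 9 or more lowercase letters, a blank.
atleast=$(grep -E ' [a-z]{9,} ' "$tmpdir/file.md")

# {4,9}: a blank, between 4 and 9 lowercase letters, a blank.
range=$(grep -E ' [a-z]{4,9} ' "$tmpdir/file.md")

printf 'exactly 9:\n%s\n\n' "$exact"
printf '9 or more:\n%s\n\n' "$atleast"
printf '4 to 9:\n%s\n' "$range"

rm -rf "$tmpdir"
```

None of these patterns contains a stray {, so they stay within what POSIX defines.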
If you look at the POSIX specification for Extended Regular Expressions, you'll find that it says using { like that is undefined behaviour:
*+?{
The <asterisk>, <plus-sign>, <question-mark>, and <left-brace> shall be special except when used in a bracket expression (see RE Bracket Expression). Any of the following uses produce undefined results:
If these characters appear first in an ERE, or immediately following a <vertical-line>, <circumflex>, or <left-parenthesis>
If a <left-brace> is not part of a valid interval expression (see EREs Matching Multiple Characters)
So, according to POSIX, you are using a regex that produces undefined results. Therefore, whatever your egrep does with it, you are getting a result that POSIX deems acceptable, even if it is not the result you wanted.
Escaping the brace removes the ambiguity, so you should be able to use the following and get the result you expect:
$ egrep '\{% +' file.md
{% highlight ruby %}
{% endhighlight %}
$
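If you would rather not rely on the backslash, there are other ways to write a literal { that avoid the undefined case. The sketch below makes some assumptions: it uses grep -E and grep -F instead of egrep, recreates the sample file in a scratch directory, and uses illustrative variable names. The bracket-expression form relies on the POSIX wording that { is not special inside a bracket expression. The fixed-string form works here only because '{% +' selects a line exactly when the line contains {% followed by a blank.

```shell
# Recreate the sample file in a scratch directory (location is arbitrary).
tmpdir=$(mktemp -d)
printf '%s\n' \
    '{% highlight ruby %}' \
    '{% endhighlight %}' \
    'not this line, though' \
    'nor {%this%}' > "$tmpdir/file.md"

# 1. Backslash escape, as in the answer above.
escaped=$(grep -E '\{% +' "$tmpdir/file.md")

# 2. Bracket expression: { is an ordinary character inside [...].
bracketed=$(grep -E '[{]% +' "$tmpdir/file.md")

# 3. Fixed string: no regex metacharacters at all, so nothing to escape.
fixed=$(grep -F '{% ' "$tmpdir/file.md")

printf '%s\n' "$escaped"

# Check that the three spellings select the same lines.
if [ "$escaped" = "$bracketed" ] && [ "$escaped" = "$fixed" ]; then
    echo "all three forms agree"
fi

rm -rf "$tmpdir"
```

The fixed-string variant is the least flexible of the three, since it cannot express anything beyond a literal substring, but it sidesteps the question of metacharacters entirely.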