I know this type of search has been addressed in a few other questions here, but for some reason I cannot get it to work in my scenario.
I have a text file that contains something similar to the following pattern:

some text here done
12345678_123456 226-
more text
some more text here done
12345678_234567 226-

I'm trying to find all cases where done is followed by 226- on the next line, with 16 characters preceding the 226-. I tried grep -Pzo and pcregrep -M, but both return nothing.

I attempted multiple regex combinations to take into account the two lines and the 16 characters in between. This is one of the patterns I tried with grep:

grep -Pzo '(?s)done\n.\{16\}226-' filename

slybloty

2 Answers

1

Generalize it to this: (?m)done$\s+.*226-$

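Requiring a \n after 226- is a problem, because the match fails when 226- is the last thing in the string. Not requiring anything after 226- is also a problem, because the pattern then matches even when more text follows on the same line. (\n|$) resolves the conflict, but why require the \n at all?

Both problems are solved with multiline mode and $.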

https://regex101.com/r/A33cj5/1
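
A minimal sketch of that end-of-input case, assuming GNU grep built with PCRE support (-P) and a reasonably recent version that accepts $ together with -Pz. The temporary file and the `count_matches` helper are only for illustration; the sample is the one from the question, written without a newline after the final 226-.

```shell
# Sample from the question, but WITHOUT a newline after the final "226-".
sample=$(mktemp)
printf '%s\n' 'some text here done' '12345678_123456 226-' 'more text' \
  'some more text here done' > "$sample"
printf '%s' '12345678_234567 226-' >> "$sample"

# Hypothetical helper: number of matches reported by grep -Pzo.
# Under -z, each match printed by -o ends with a NUL byte, so counting
# NULs counts matches. "|| true" absorbs grep's no-match exit status.
count_matches() {
  { grep -Pzo "$1" "$2" || true; } | tr -cd '\0' | wc -c | tr -d ' '
}

# Requires a literal \n after 226-, so the last record is missed.
strict=$(count_matches 'done\n.{16}226-\n' "$sample")

# Multiline mode: $ matches before a newline or at the end of the input.
general=$(count_matches '(?m)done$\s+.*226-$' "$sample")

printf 'literal newline required: %s match(es)\n' "$strict"
printf 'multiline mode with $:    %s match(es)\n' "$general"
rm -f "$sample"
```

Note that this generalized pattern no longer checks for exactly 16 characters before 226-, and \s+ also accepts blank lines between the two lines.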

0

You must not escape { and } when using grep's -P (PCRE) option. Escaping the braces is only needed in BRE, where \{ and \} delimit the repetition count.
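
The difference can be checked with a quick sketch like the following, assuming GNU grep built with PCRE support. The temporary file recreates the sample from the question; the variable names are only for illustration.

```shell
# Recreate the sample from the question in a temporary file.
sample=$(mktemp)
printf '%s\n' 'some text here done' '12345678_123456 226-' 'more text' \
  'some more text here done' '12345678_234567 226-' > "$sample"

# With -P, \{16\} is the literal text "{16}", so nothing matches.
# "|| true" keeps grep's no-match exit status from aborting a strict script.
escaped=$({ grep -Pzo 'done\n.\{16\}226-' "$sample" || true; } | tr '\0' '\n')

# Unescaped braces are the PCRE quantifier: exactly 16 characters.
# tr turns the NUL that -z puts after each match into a newline.
unescaped=$({ grep -Pzo 'done\n.{16}226-' "$sample" || true; } | tr '\0' '\n')

printf 'escaped braces:   [%s]\n' "$escaped"
printf 'unescaped braces:\n%s\n' "$unescaped"
rm -f "$sample"
```

The escaped form prints an empty pair of brackets, while the unescaped form prints both done / 226- pairs.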

You can use:

grep -ozP 'done\R.{16}226-\R' file

done
12345678_123456 226-
done
12345678_234567 226-

\R matches any Unicode newline sequence. If you are only dealing with \n, you can simply use:

grep -ozP 'done\n.{16}226-\n' file
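
As a rough illustration of when the choice between \R and \n matters, the sketch below writes part of the sample with Windows-style CRLF line endings and counts the matches for each pattern. It assumes GNU grep with PCRE support; the temporary file and the `count_matches` helper are hypothetical names for this example.

```shell
# Part of the question's sample, written with CRLF (\r\n) line endings.
sample=$(mktemp)
printf '%s\r\n' 'some text here done' '12345678_123456 226-' 'more text' > "$sample"

# Hypothetical helper: number of matches reported by grep -ozP.
# Under -z, each match printed by -o ends with a NUL byte, so counting
# NULs counts matches. "|| true" absorbs grep's no-match exit status.
count_matches() {
  { grep -ozP "$1" "$2" || true; } | tr -cd '\0' | wc -c | tr -d ' '
}

# \n alone does not match, because "done" is followed by \r.
lf_only=$(count_matches 'done\n.{16}226-\n' "$sample")

# \R matches \r\n as a single newline sequence.
any_newline=$(count_matches 'done\R.{16}226-\R' "$sample")

printf 'pattern with \\n: %s match(es)\n' "$lf_only"
printf 'pattern with \\R: %s match(es)\n' "$any_newline"
rm -f "$sample"
```

On a file with plain \n line endings, both patterns find the same matches.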
anubhava
  • You have no idea how much time I wasted messing around with this and never thought of not escaping the `{` `}`. – slybloty Nov 03 '17 at 20:53
  • @slybloty Escapes are tricky, and work differently in different Regex languages. For Perl, there's a simple rule: "Any escaped punctuation mark is interpreted as a literal; and any non-escaped alpha-numeric character is interpreted as a literal." `grep`, `egrep`, `vim` and others deviate from this basic rule to varying extents; just memorize the specific exceptions if you need to use those. – jpaugh Nov 03 '17 at 21:24