I am trying to use grep to select multi-line records from a file.
The records look something like this:
########## Ligand Number : 1
blab bla bla
bla blab bla
########## Ligand Number : 2
blab bla bla
bla blab bla
########## Ligand Number : 3
bla bla bla
<EOF>
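I am using Perl-compatible regular expressions (-P).
To get around grep's line-by-line matching, I use grep -zo. With -z, grep treats the input as NUL-terminated records rather than newline-terminated lines, so the pattern can consume multiple lines, and -o outputs exactly the part I want. Generally, it works fine.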
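As a small illustration of what -z changes, here is a sketch that assumes GNU grep. The sample text is made up and is not from my real file:

```shell
# Without -z, grep sees two separate lines, and both contain "line".
per_line=$(printf 'first line\nsecond line\n' | grep -c 'line')

# With -z, records end at NUL bytes. The input has none, so the whole
# input is a single record that still contains its '\n' characters.
per_record=$(printf 'first line\nsecond line\n' | grep -zc 'line')

echo "matching lines:   $per_line"
echo "matching records: $per_record"
```

The second count is per record, not per line, which is why a pattern can span line breaks in this mode.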
However, the problem is that the record delimiter is two empty lines after the last line of a record (three consecutive '\n' characters: one ending the line and two for the two empty lines).
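For anyone who wants to reproduce this, the following sketch builds a sample file with that layout. The record contents are placeholders, and it assumes that every record, including the last one, is followed by two empty lines:

```shell
# Hypothetical sample data: three records, each followed by two empty lines.
# printf reuses the format string once for each of the arguments 1, 2 and 3.
printf '########## Ligand Number : %d\nblab bla bla\nbla blab bla\n\n\n' 1 2 3 > inputFile

# Print the file unambiguously: sed's l command marks each end of line with $,
# so the two empty lines after every record show up as lines holding only "$".
sed -n l inputFile
```

The file name inputFile matches the one used in the commands in this post.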
When I try an expression like this:
grep -Pzo '^########## Ligand Number :\s+\d+.+?\n\n\n' inputFile
it returns nothing. It seems that grep cannot handle consecutive '\n' characters.
Can anybody explain why?
P.S. I have already worked around it by translating the '\n' characters to '\a' first and then translating them back, as in the following example:
tr '\n' '\a' < inputFile | grep -Po '########## Ligand Number :\s+\d+\a.+?\a\a\a' | tr '\a' '\n'
But I need to understand why grep could not match the '\n\n\n' pattern.