
Question

Suppose I have a file like this:

I've got a loverly bunch of coconut trees.

Newlines!

Bahahaha
Newlines!
the end.

I'd like to replace an occurrence of "Newlines!" that is surrounded by blank lines with (say) NEWLINES!. So the ideal output is:

I've got a loverly bunch of coconut trees.

NEWLINES!

Bahahaha
Newlines!
the end.

Attempts

Ignoring "surrounded by newlines", I can do:

perl -p -e 's@Newlines!@NEWLINES!@g' input.txt

This replaces all occurrences of "Newlines!" with "NEWLINES!".

Now I try to pick out only the "Newlines!" surrounded with \n:

perl -p -e 's@\nNewlines!\n@\nNEWLINES!\n@g' input.txt

No luck (note: I don't need the s switch because I'm not using . and I don't need the m switch because I'm not using ^ and $; regardless, adding them doesn't make this work). Lookaheads/lookbehinds don't work either:

perl -p -e 's@(?<=\n)Newlines!(?=\n)@NEWLINES!@g' input.txt

After a bit of searching, I see that perl reads in the file line-by-line (makes sense; sed does too). So, I use the -0 switch:

perl -0p -e 's@(?<=\n)Newlines!(?=\n)@NEWLINES!@g' input.txt

Of course this doesn't work -- -0 replaces new line characters with the null character.

So my question is -- how can I match this pattern (I'd prefer not to write any perl beyond the regex 's@pattern@replacement@flags' construct)?

Is it possible to match this null character? I did try:

perl -0p -e 's@(?<=\0)Newlines!(?=\0)@NEWLINES!@g' input.txt

to no effect.

Can anyone tell me how to match newlines in perl? Whether in -0 mode or not? Or should I use something like awk? (I started with sed but it doesn't seem to have lookahead/behind support even with -r. I went to perl because I'm not at all familiar with awk).

cheers.

(PS: this question is not what I'm after because their problem had to do with a .+ matching newline).

mathematical.coffee
  • "-0 replaces new line characters with the null character." is not true. Try `perl -MO=Deparse -0pe'...'` to see effect of `-0`. – ikegami Feb 05 '12 at 08:17
  • @ikegami Where can I read about this "-0" mode of perl? I don't know what to google.. – user13107 Apr 08 '13 at 16:28

3 Answers


The following should work for you:

perl -0pe 's@(?<=\n\n)Newlines!(?=\n\n)@NEWLINES!@g'
anubhava
  • Huh, I just tried what I tried earlier and it worked. I must have been mashing keys earlier :P (also, thanks for picking up the `\n\n`). cheers! – mathematical.coffee Feb 05 '12 at 09:03

If the file is small enough to be slurped into memory all at once:

perl -0777 -pe 's/\n\nNewlines!(?=\n\n)/\n\nNEWLINES!/g'

Otherwise, keep a buffer of the last three lines read:

perl -ne 'push @buffer, $_; $buffer[1] = "NEWLINES!\n" if @buffer == 3 && ' \
      -e 'join("", @buffer) eq "\nNewlines!\n\n"; ' \
      -e 'print shift @buffer if @buffer == 3; END { print @buffer }'
Sean
  • cheers, I forgot to mention the files will always be small enough to read entirely into memory. One question: what's the `-0777`? I know `-0` but not `-0777`. – mathematical.coffee Feb 05 '12 at 09:04
  • @mathematical.coffee `-0` takes an argument in the form of an octal number, and 777 is by convention used as a sufficiently large number to cause file slurping. – TLP Feb 05 '12 at 11:42

I think the way you went about things caused you to combine possible solutions in a way that didn't work.

If you use the in-place editing flag (-i), you can do it like this:

perl -0p -i.bk -e 's/\n\nNewlines!\n\n/\n\nNEWLINES!\n\n/g' input.txt

I have doubled the \n's to make sure you only get the ones with empty lines above and below.

Ilion