I'm new to Regexp::Grammars, and am having trouble matching a multi-line pattern. I have this input:

my $text = <<EOD;
HEADER:
This is a multi-line section, because the
second line is down here.

EOD

and this grammar:

use Regexp::Grammars;
my $parser = qr{
  <nocontext:>
  <doc>
  <rule: doc>           <[section]>+
  <rule: section>       <label> : <text> (\n\n | $)
  <token: label>        [A-Z0-9_&/ -]+
  <token: text>         [^\n]*
}xms;

I'm only matching the first line of the section, but I'd like to capture all text up to a blank line or end of input. Can anyone see what I'm doing wrong?

Jeff French
  • Well, what you did wrong is tell it `<text>` cannot contain newlines, so it doesn't. What isn't so obvious is the correct solution. – cjm Jul 14 '12 at 18:11
  • @cjm, yes, good point. I should have shown my other attempts. I had tried defining `<text>` as `.*`, but that ate up everything, including subsequent sections. I thought `.*?` might work, but that stopped at the first newline. – Jeff French Jul 14 '12 at 18:21
  • @Jeff, because `.` doesn't match `\n` unless you use `/s`, so `(?s:.*?)` would probably work. Or maybe `.+(?:\n.+)*`. – Qtax Jul 14 '12 at 19:22

1 Answer

One solution is to change <text> as follows:

<token: text>         (?:(?!\n\n).)*

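This matches zero or more characters, none of which is a newline immediately followed by another newline, so <text> stops just before the blank line (\n\n) that ends the section. It relies on the /s flag already on the grammar (}xms), which lets . match a newline. It's probably not the best possible solution, but it works.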
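
To see the lookahead at work outside the grammar, here is a minimal sketch using a plain Perl regex rather than Regexp::Grammars. The $section pattern is only a hand-written approximation of the section rule, and the FOOTER section is made-up input added to show that the first section no longer swallows the next one.

```perl
use strict;
use warnings;

# Sample input: the question's section plus a hypothetical second one.
my $input = <<'EOD';
HEADER:
This is a multi-line section, because the
second line is down here.

FOOTER:
Single line.

EOD

# Plain-regex approximation of: <label> : <text> (\n\n | $)
my $section = qr{
    ( [A-Z0-9_&/ -]+ )       # label (the space in the class is literal under /x)
    : \s*
    ( (?: (?!\n\n) . )* )    # text: stop before the first blank line
    (?: \n\n | \z )          # section ends at a blank line or end of input
}xms;

# Collect every section as a label/text pair.
my @sections;
while ( $input =~ /$section/g ) {
    push @sections, { label => $1, text => $2 };
}

for my $s (@sections) {
    print "[$s->{label}]\n$s->{text}\n---\n";
}
```

Each character is consumed only if the negative lookahead confirms that a blank line does not start at that position, so the match can span single newlines but never crosses \n\n.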

cjm