
In a regular expression, in multiline mode, ^ and $ stand for the start and end of line. How can I match the end of the whole string?

In the string

Hello\nMary\nSmith\nHello\nJim\nDow

the expression

/^Hello(?:$).+?(?:$).+?$/ms

matches Hello\nMary\nSmith.

I wonder whether there is a metacharacter (like \ENDSTRING) that matches the end of the whole string, not just the end of a line, such that

/^Hello(?:$).+?(?:$).+?\ENDSTRING/ms

would match Hello\nJim\nDow. Similarly, is there a metacharacter that matches the start of the whole string, not the start of a line?

Alexander Gelbukh

2 Answers


There are indeed assertions for that (see perlre):

\A Match only at beginning of string
\Z Match only at end of string, or before newline at the end

...
The \A and \Z are just like ^ and $, except that they won't match multiple times when the /m modifier is used, while ^ and $ will match at every internal line boundary. To match the actual end of the string and not ignore an optional trailing newline, use \z.

Also see Assertions in perlbackslash.

I am not sure what you're after in the example shown, so here is another one:

perl -wE'$_ = qq(one\ntwo\nthree); say for /(\w+\n\w+)\Z/m'

prints

two
three

while with $ instead of \Z it prints

one
two

Note that the above example would match qq(one\ntwo\nthree\n) as well (with a trailing newline), which may or may not be suitable. Please compare \Z and \z from the above quote for your actual needs. Thanks to ikegami for a comment.

zdim

\A and \z always match the beginning and the end of the string, respectively.

       without /m              with /m

\A     Beginning of string     Beginning of string
^      \A                      \A|(?<=\n)(?!\z)

\z     End of string           End of string
\Z     \z|(?=\n\z)             \z|(?=\n\z)
$      \z|(?=\n\z)             \z|(?=\n)

Put differently,

┌─────────────────── `\A` and `^`
│     ┌───────────── `(?m:$)`
│     │ ┌─────────── `(?m:^)`
│     │ │     ┌───── `\Z` and `$`
│     │ │     │ ┌─── `\z`, `\Z` and `$`
│     │ │     │ │
F o o ␊ B a r ␊

Remember, all of these matches are zero-length.

ikegami