330

Is there a regex to match "all characters including newlines"?

For example, in the regex below, $2 produces no output because the . in (.+?) does not match newlines.

$string = "START Curabitur mollis, dolor ut rutrum consequat, arcu nisl ultrices diam, adipiscing aliquam ipsum metus id velit. Aenean vestibulum gravida felis, quis bibendum nisl euismod ut. 

Nunc at orci sed quam pharetra congue. Nulla a justo vitae diam eleifend dictum. Maecenas egestas ipsum elementum dui sollicitudin tempus. Donec bibendum cursus nisi, vitae convallis ante ornare a. Curabitur libero lorem, semper sit amet cursus at, cursus id purus. Cras varius metus eu diam vulputate vel elementum mauris tempor. 

Morbi tristique interdum libero, eu pulvinar elit fringilla vel. Curabitur fringilla bibendum urna, ullamcorper placerat quam fermentum id. Nunc aliquam, nunc sit amet bibendum lacinia, magna massa auctor enim, nec dictum sapien eros in arcu. 

Pellentesque viverra ullamcorper lectus, a facilisis ipsum tempus et. Nulla mi enim, interdum at imperdiet eget, bibendum nec END";

$string =~ /(START)(.+?)(END)/;

print $2;
kurotsuki

7 Answers

449

If you don't want to add the /s regex modifier (perhaps you still want . to retain its original meaning elsewhere in the regex), you can also use a character class. One possibility:

[\S\s]

This matches a character that is either non-whitespace or whitespace. In other words, any character.

You can also change modifiers locally in a small part of the regex, like so:

(?s:.)
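
As a rough illustration of the character-class approach, here is a minimal JavaScript sketch (the question is also tagged javascript). The sample text and the helper name `captureBetween` are made up for this example and are not part of the original question.

```javascript
// Hypothetical helper: returns the text between START and END, or null.
function captureBetween(text) {
  // [\s\S] matches whitespace or non-whitespace, i.e. any character,
  // so the lazy quantifier can cross line breaks without any flag.
  const match = /(START)([\s\S]+?)(END)/.exec(text);
  return match ? match[2] : null;
}

const sample = "START first line\nsecond line END";
console.log(JSON.stringify(captureBetween(sample)));

// A plain dot stops at the newline, so this pattern finds nothing:
console.log(/(START)(.+?)(END)/.test(sample)); // false
```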
John Smith
ephemient
252

Add the s modifier to your regex to cause . to match newlines:

$string =~ /(START)(.+?)(END)/s;
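
JavaScript has had an equivalent `s` (dotAll) flag since ES2018. A minimal sketch, assuming an engine that supports it; the helper name `captureWithDotAll` and the sample string are invented for illustration:

```javascript
// Hypothetical helper; assumes an engine with the ES2018 `s` (dotAll) flag.
function captureWithDotAll(text) {
  // With the s flag, . also matches line terminators such as \n.
  const match = /(START)(.+?)(END)/s.exec(text);
  return match ? match[2] : null;
}

const twoLines = "START line one\nline two END";
console.log(captureWithDotAll(twoLines) !== null); // true
console.log(/(START)(.+?)(END)/.test(twoLines));   // false: no s flag
```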
BoltClock
  • 42
    In JavaScript: (START)[\s\S]*(END) - See www.regexpal.com to test – Zymotik Jul 15 '14 at 15:40
  • 1
    For more info regarding @Zymotik's comment, see: http://stackoverflow.com/questions/1068280/javascript-regex-multiline-flag-doesnt-work – Jacob van Lingen Jul 19 '16 at 07:14
  • 3
    In Java you can use the inline modifier (?s) at the beginning of the regex, for example to replace any character including newlines after 'yourPattern' use `"(?s)yourPattern.*"`- Also see: https://www.rexegg.com/regex-modifiers.html#dotall – LukeSolar Aug 07 '19 at 13:40
  • In Ruby, the modifier is `m`, not `s`. See: https://rubular.com/ – Jon Schneider Jan 13 '20 at 23:17
  • JavaScript now supports this way. ES2018 added the `s` [dotAll](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/dotAll) flag. – Lyall Sep 28 '20 at 03:56
17

This is very readable to me and matches "any character or newline"

(.|\n)*

It behaves the same as

[\S\s]*

and the same as

(?s:.)*

You can also add a ? after the quantifier to make it lazy (non-greedy), so the match stops at the first end_string: (.|\n)*?

// Lazy (stops at the first end_string)
start_string(.|\n)*?end_string

Otherwise, with only (.|\n)*, the quantifier is greedy and the match runs through to the last end_string, swallowing any earlier ones:

start_string some text
and newlines end_string
some more text end_string
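
The difference can be checked with a short JavaScript sketch that uses the sample text above. The variable names are only for illustration.

```javascript
const text =
  "start_string some text\nand newlines end_string\nsome more text end_string";

// Greedy: * takes as much as it can, so the match runs to the LAST end_string.
const greedy = text.match(/start_string(.|\n)*end_string/)[0];

// Lazy: *? takes as little as it can, so the match stops at the FIRST end_string.
const lazy = text.match(/start_string(.|\n)*?end_string/)[0];

console.log(greedy === text);                      // true
console.log(lazy.endsWith("newlines end_string")); // true
```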
Julesezaar
10

Yep, you just need to make . match newlines:

$string =~ /(START)(.+?)(END)/s;
FailedDev
0

I like to use an empty negated character class. A negated class matches any character not listed in it, and since nothing is listed, it matches anything, including newlines.

[^]

To match zero or more characters:

[^]*

Or one or more:

[^]+

Tested in JavaScript.
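
A minimal sketch of how this behaves in a JavaScript engine. The helper name `matchesWholeString` is hypothetical, and other regex flavors may reject `[^]` as malformed.

```javascript
// [^] is an empty negated class; JavaScript accepts it and matches
// any single character, including line breaks.
function matchesWholeString(text) {
  const match = /[^]*/.exec(text);
  return match !== null && match[0] === text;
}

console.log(matchesWholeString("line one\nline two")); // true
console.log(/^[^]+$/.test(""));                        // false: + needs at least one character
```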

Ayo Reis
  • Not sure about this. What specific regex engine implementation are you using? I don't think this notation has a conventional or widely-adopted meaning. Notepad++, for example, rejects this expression as malformed. One problem is that, if the engine can't assume there is at least one character in the (negated) set, then you'd have to establish another escape sequence in order to negate the set of a single `']'` character. – Glenn Slayden Aug 05 '23 at 03:58
  • I'm using Chrome (V8). If I paste `/[^]*/.test('whatever')` in the console, it returns `true`. – Ayo Reis Aug 07 '23 at 12:08
-1

Go with the other answers that use the /s flag to let the . match every character, including newlines.

Perl v5.12 added \N as a character class shortcut that always matches any character except a newline, regardless of the /s setting. This gives \n a partner, just as \s has \S.

With this, you can combine both sides of a complement, as the similar answers do: [\n\N], [\s\S], and so on.

However, you've also tagged this with javascript, which thinks \N is just capital N.
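
A quick way to see this in JavaScript is sketched below, assuming a regex without the `u` flag. The helper names are invented for the example.

```javascript
// Without the u flag, JavaScript treats \N as an escaped literal "N".
function backslashNMatches(text) {
  return /\N/.test(text);
}

// The complement pair [\s\S] still means "any single character" here.
function isAnySingleChar(ch) {
  return /^[\s\S]$/.test(ch);
}

console.log(backslashNMatches("N"));   // true: matches the letter N
console.log(backslashNMatches("abc")); // false: no letter N present
console.log(isAnySingleChar("\n"));    // true
```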

brian d foy
-6

You want to use "multiline".

$string =~ /(START)(.+?)(END)/m;
BoltClock
nadime