28

The . character in a php regex accepts all characters, except a newline. What can I use to accept ALL characters, including newlines?

Timothy
Entity

5 Answers

51

This is commonly used to capture all characters:

[\s\S]

You could use any other combination of "Type-X + Non-Type-X" in the same way:

[\d\D]
[\w\W]

but [\s\S] is recognized by convention as a shorthand for "really anything".

You can also use the . if you switch the regex into "dotall" (a.k.a. "single-line") mode via the "s" modifier. Sometimes that's not a viable solution (dynamic regex in a black box, for example, or if you don't want to modify the entire regex). In such cases the other alternatives do the same, no matter how the regex is configured.
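
As a minimal sketch of the difference (the sample string and variable names are only illustrative), the following compares a plain dot, a dot in "dotall" mode, and the [\s\S] class against a two-line string. The \A and \z anchors force the pattern to cover the whole string, so a match is only possible if the newline is consumed:

```php
<?php
// Hypothetical sample input containing one newline.
$text = "one\ntwo";

// Plain ".": cannot cross the "\n", so the whole string cannot match.
$plainDot = preg_match('/\A.+\z/', $text);       // 0

// "s" modifier (dotall): "." now matches "\n" as well.
$dotAll = preg_match('/\A.+\z/s', $text);        // 1

// [\s\S]: matches any character regardless of modifiers.
$classAny = preg_match('/\A[\s\S]+\z/', $text);  // 1

var_dump($plainDot, $dotAll, $classAny);
```

The character class is the one to reach for when the modifiers are outside your control, since it behaves the same with or without "s".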

Tomalak
11

It's the . character that means "every character" (edit: OP edited). You need to add the s option to your regexp, for example:

preg_match("`(.+)`s", "\n");
Vincent Savard
  • Aren't there supposed to be forward slashes at the beginning and end of a regexp? – Entity Oct 26 '10 at 17:29
  • 5
    Can be, but any pair of delimiters will do. – Tim Pietzcker Oct 26 '10 at 17:31
  • Not in PHP. It has to start and end with a delimiter (you can choose it), and every character past the last delimiter is an option (i.e. U for ungreedy, i for case-insensitive, etc.) – Vincent Savard Oct 26 '10 at 17:31
  • +1 Depending on your needs `m` is an option as well. But based on the OP, `s` is the way to go. – Jason McCreary Oct 26 '10 at 17:33
  • 1
    Someone should explain `s` (and perhaps `m`) to make this really complete. – Buttle Butkus Sep 12 '13 at 02:55
  • `/m` makes `^` and `$` apply to each line (instead of full string), `/s` makes `.` also match `\n`, and `/ms` applies both (check each line and full string). See [PHP manual regarding modifiers](http://php.net/manual/en/reference.pcre.pattern.modifiers.php) and [Perl RegEx manual](http://perldoc.perl.org/perlre.html) (which PHP's preg_match is based on). – Synexis Oct 08 '16 at 05:39
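
To illustrate the two modifiers described in the comments above, here is a small sketch (the sample string and variable names are assumptions made for the example). It shows `m` changing where `^` and `$` can match, and `s` changing what `.` can match:

```php
<?php
// Hypothetical two-line input.
$text = "alpha\nbeta";

// "m": ^ and $ match at each line boundary, so "beta" is found on line 2.
$withM    = preg_match('/^beta$/m', $text);  // 1
// Without "m", ^ only matches at the very start of the string.
$withoutM = preg_match('/^beta$/', $text);   // 0

// "s": "." also matches "\n", so the pattern can span both lines.
$withS    = preg_match('/alpha.beta/s', $text);  // 1
// Without "s", "." refuses the newline between the two words.
$withoutS = preg_match('/alpha.beta/', $text);   // 0

var_dump($withM, $withoutM, $withS, $withoutS);
```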
1

would

[.\n]+

not work?

How about (.|\n)+? I tested it and it seems to work.

I am quite sure this is the literal interpretation of exactly what you were asking for.

gnomed
  • The `.` in a character class does not mean "any character". It means "a dot". Character classes have their own syntax. ;-) – Tomalak Oct 26 '10 at 17:38
  • @Tomalak: Thanks for the explanation, I just realized it now. I guess I should test my answers before I post them. I've edited my answer now. – gnomed Oct 26 '10 at 17:43
  • Common error. I see people do `[this|that|\d]` a lot, when they really mean `(this|that|\d)`. *P.S.: `(.|\n)` works but it may be slightly less efficient than a character class.* – Tomalak Oct 26 '10 at 17:50
  • Glad all I had was some metacharacter confusion. I don't think I would ever try to put an "|" inside "[]". I just like to avoid "()" whenever possible, because in Perl (and other languages) they are also used to initialize special variables when something inside them matches. – gnomed Oct 26 '10 at 17:55
  • I think a problem with this approach is that you "hardwire" the set of characters. If one day, one invents a character that is not matched by `.` (already the case: `\t`), one needs to *rewrite* all libraries that were based on such assumption... – Willem Van Onsem Oct 09 '14 at 19:03
  • @WillemVanOnsem `\t` is matched by `.`. – steffen May 14 '17 at 08:04
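
The points made in these comments can be checked with a short sketch (inputs and variable names are chosen only for illustration): inside a character class the dot is a literal dot, the alternation (.|\n) does cover newlines, and a plain dot does match a tab.

```php
<?php
// [.\n] means "a literal dot or a newline", not "any character".
$classOnLetter = preg_match('/\A[.\n]+\z/', "x");     // 0: "x" is neither
$classOnDots   = preg_match('/\A[.\n]+\z/', ".\n.");  // 1: only dots and a newline

// Alternation: "any character except newline" OR "newline".
$altOnText = preg_match('/\A(.|\n)+\z/', "x\ny");     // 1

// A plain "." matches a tab; only the newline is excluded by default.
$dotOnTab = preg_match('/\A.\z/', "\t");              // 1

var_dump($classOnLetter, $classOnDots, $altOnText, $dotOnTab);
```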
0

The PHP Manual page for Dot states that:

If the PCRE_DOTALL option is set, then dots match newlines as well.

Franco
0

An important thing is missing here. [\s\S] matches one character, whereas a newline can be a character sequence. (Windows uses two characters: \r\n.) Neither . (with the s modifier, i.e. PCRE_DOTALL) nor [\s\S] will match such a newline sequence as a single unit. The best way to match any character or any newline is (.|\R), that is, "anything except a newline, or a newline". \R matches \n, \r and \r\n, among other line-break characters.
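
A brief sketch of that distinction, assuming a Windows-style line ending as input (the variable names are illustrative): counting matches shows that [\s\S] sees \r\n as two characters, while \R sees it as one newline.

```php
<?php
// A single Windows-style newline: two characters, one line break.
$crlf = "\r\n";

// [\s\S] consumes one character per match, so CRLF yields two matches.
$perChar = preg_match_all('/[\s\S]/', $crlf);  // 2

// \R consumes the whole "\r\n" pair as one newline sequence.
$perNewline = preg_match_all('/\R/', $crlf);   // 1

var_dump($perChar, $perNewline);
```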

steffen