Some JavaScript implementations accept [] and [^] as "no character" and "any character" respectively. Keep in mind that this is specific to the JavaScript regex flavour. (If you are interested in the subject, you can take a look at this post.)
In other words, [^] is a shortcut for [\s\S], since JavaScript historically had no dotall (singleline) mode in which the dot can match newlines (the s flag was only added in ES2018).
Thus, to obtain the same result in PHP, you must replace [^] with the dot (which by default matches any character except a newline) and either add the singleline modifier s after the closing delimiter or put (?s) before the dot, so that it matches newlines too. Examples: /.+/s or /(?s).+/
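As a quick check of the above, here is a minimal sketch comparing the dot with and without the s modifier. The two-line $subject is made up for illustration and is not taken from your data.

```php
<?php
// Hypothetical two-line input, only for illustration.
$subject = "first line\nsecond line";

// Default behaviour: the dot does not match a newline,
// so the match stops at the end of the first line.
preg_match('/.+/', $subject, $plain);

// s modifier after the closing delimiter: the dot matches newlines too.
preg_match('/.+/s', $subject, $withModifier);

// Inline (?s) at the start of the pattern: same effect.
preg_match('/(?s).+/', $subject, $withInline);

// [\s\S] needs no modifier at all, like [^] in JavaScript.
preg_match('/[\s\S]+/', $subject, $withClass);

var_dump($plain[0] === "first line");     // bool(true)
var_dump($withModifier[0] === $subject);  // bool(true)
var_dump($withInline[0] === $subject);    // bool(true)
var_dump($withClass[0] === $subject);     // bool(true)
```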
For your particular case, however, this pattern seems more appropriate:
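// PREG_SET_ORDER returns one array per match instead of one array per capturing group.
preg_match_all('~((?>[^rn\\\:]++|(?<!\\\)[rn])+):([^\\\]++)~', $subject, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
    // $match[1] is the text before the colon, $match[2] the text after it.
    echo $match[1].' '.$match[2].'<br/>';
}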
Pattern explanation:
~ # pattern delimiter
( # open the first capturing group
(?> # open an atomic group
[^rn\\\:]++ # all characters that are not "r", "n", "\" or ":"
| # OR
(?<!\\\)[rn] # "r" or "n" not preceded by "\"
)+ # close the atomic group and repeat one or more times
) # close the first capturing group
: # a literal colon
( # open the second capturing group
[^\\\]++ # all characters except "\" one or more times
) # close the second capturing group
~
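To see the pattern at work, here is a small run on a made-up subject. The input format is an assumption on my part: I suppose the pairs are separated by the four literal characters \r\n (backslash, r, backslash, n) rather than by a real line break, which is what the lookbehind in the pattern suggests.

```php
<?php
// Hypothetical input: two pairs separated by a literal backslash-r backslash-n.
// Single quotes keep \r and \n as plain characters, not as a line break.
$subject = 'name:John\r\nage:30';

preg_match_all('~((?>[^rn\\\:]++|(?<!\\\)[rn])+):([^\\\]++)~', $subject, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $match) {
    // Group 1 is the text before the colon, group 2 the text after it.
    $pairs[] = [$match[1], $match[2]];
}

print_r($pairs);  // two pairs: name / John and age / 30
```

The "n" of "name" is accepted because it is not preceded by a backslash, while the "r" and "n" that follow a backslash are skipped.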
Notes:
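When you want to match a literal \ (backslash) with a pattern written as a single-quoted string, the backslash must be escaped twice: once for the PHP string and once for the regex engine. The full form is \\\\, which can be shortened to \\\ when the next character is neither a backslash nor a single quote, as in the pattern above.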
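The following sketch shows why the three-backslash form works. The variable names are only for illustration.

```php
<?php
// In single quotes only \\ and \' are escape sequences;
// any other backslash is kept as it is.
$short = '[^\\\]';   // \\ gives one backslash, \] is kept: the regex engine sees [^\\]
$full  = '[^\\\\]';  // \\ gives one backslash, twice: the regex engine sees [^\\] too

var_dump($short === $full);  // bool(true)
var_dump(strlen($short));    // int(5)

// Both spellings mean "any character except a backslash".
$noBackslash   = preg_match('~^[^\\\]+$~', 'abc');   // subject without a backslash
$withBackslash = preg_match('~^[^\\\]+$~', 'a\\b');  // subject is a, backslash, b

var_dump($noBackslash);    // int(1)
var_dump($withBackslash);  // int(0)
```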
The principle of this pattern is to use negated character classes and negative assertions; in other words, it describes what the desired substrings cannot contain.
The above pattern uses atomic groups (?>...) and possessive quantifiers ++ in place of non-capturing groups (?:...) and simple quantifiers +. They match in the same way, except that the regex engine cannot go back and try other possibilities when a later part of the pattern fails, because atomic groups and possessive quantifiers do not record backtracking positions. You can gain performance with this kind of feature.
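The difference is easiest to see on a toy example where a match is only possible if the engine gives back a character. The subject and patterns below are made up for the demonstration. In the pattern of this answer no such backtracking is ever needed, so the results stay the same and only the bookkeeping is saved.

```php
<?php
// Hypothetical subject: matching it requires giving back one "a".
$subject = 'aaa';

// Greedy quantifier: a+ first takes all three letters,
// then backtracks by one so that the final a can match.
$greedy = preg_match('~^a+a$~', $subject);

// Possessive quantifier: a++ keeps all three letters,
// so the final a has nothing left to match.
$possessive = preg_match('~^a++a$~', $subject);

// Atomic group: behaves like the possessive quantifier here.
$atomic = preg_match('~^(?>a+)a$~', $subject);

var_dump($greedy, $possessive, $atomic);  // int(1), int(0), int(0)
```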