To avoid the string trap, one approach is to match first what you want to leave untouched (here, the quoted strings) and then either capture it or skip it.
The ereg_ functions have been deprecated since PHP 5.3 (and were removed in PHP 7.0), but where they are still available you can use them:
$result = ereg_replace('("([^\\\"]|\\\.)*")|//[^' . "\n" . ']*|/\*\**([^*]|\*\**[^*/])*\*\**/', '\1', $str);
It works, but performance is very poor compared with the preg version (PCRE has many features that help to improve the pattern):
$pattern2 = '~
    " [^"\\\]* (?s: \\\. [^"\\\]* )*+ "  # double quoted string
    (*SKIP)(*F)  # forces the pattern to fail and skips the previous substring
  |
    /
    (?:
        / .*     # singleline comment
      |
        \*       # multiline comment
        [^*]* (?: \*+(?!/) [^*]* )*+
        (?: \*/ )?   # optional to deal with unclosed comments
    )
~xS';
$result = preg_replace($pattern2, '', $str);
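
As a quick illustration of the skip technique, here is a minimal, self-contained sketch that applies the same pattern to a made-up input. The sample text in $str is purely hypothetical; it only serves to show that comment-like sequences inside double quoted strings survive while real comments are removed.

```php
<?php
// Same pattern as above: double quoted strings are matched first and then
// rejected with (*SKIP)(*F), so only real comments are left to be replaced.
$pattern = '~
    " [^"\\\]* (?s: \\\. [^"\\\]* )*+ "  # double quoted string
    (*SKIP)(*F)                          # fail, and restart after the string
  |
    /
    (?:
        / .*                             # singleline comment
      |
        \*                               # multiline comment
        [^*]* (?: \*+(?!/) [^*]* )*+
        (?: \*/ )?                       # optional: unclosed comments
    )
~xS';

// Hypothetical sample input (nowdoc, so nothing is interpolated).
$str = <<<'EOT'
$a = "not // a comment"; // real comment
$b = "nor /* this */"; /* removed */ $c = 1;
EOT;

$result = preg_replace($pattern, '', $str);
echo $result, "\n";
```

Only the comment text is removed, so the whitespace that surrounded a comment stays in the result.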
The preg version is about 450x faster than the ereg_ version.
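
For completeness, the capture alternative (the idea used in the ereg_ call) can also be sketched with preg_replace. This is an assumed translation, not a pattern taken from the answer: group 1 captures a double quoted string and is put back through the replacement, while the comment branches leave group 1 empty and are therefore deleted. The input string is a hypothetical sample.

```php
<?php
// Capture variant: group 1 holds a double quoted string, the two other
// branches match // comments and /* */ comments without capturing anything.
$pattern = '~("(?:[^"\\\\]|\\\\.)*")|//[^\n]*|/\*(?:[^*]|\*+[^*/])*\*+/~s';

// Hypothetical sample input.
$str = "a = \"keep // this\"; // drop this\nb = 2; /* drop\n this too */ c = 3;";

// '$1' restores the string when group 1 matched, and is empty otherwise.
$result = preg_replace($pattern, '$1', $str);
echo $result, "\n";
```

Unlike the skip version, this one needs a closing */ to match a multiline comment.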
Details of the subpattern [^*]* (?: \*+(?!/) [^*]* )*+ :
This subpattern describes the content of a multiline comment, that is, everything between /* and the first */ :
[^*]*        # all that is not an asterisk (can be empty)
(?:          # open a non-capturing group:
             # the purpose of this group is to handle asterisks that
             # are not part of the closing sequence */
    \*+      # one or more asterisks
    (?!/)    # negative lookahead: not followed by /
             # (it is a zero-width assertion, in other words it is only a test
             # and it doesn't consume characters)
    [^*]*    # zero or more characters that are not an asterisk
)*+          # repeat the group zero or more times (possessive)
Approximate walk of the regex engine through the string /*aaaa**bbb***cc***/ :
(In each step, { } marks the part of the string and the part of the pattern the engine is currently working on. The braces are only markers, they are not part of the string or of the pattern.)

string                   pattern                                  result
{/*}aaaa**bbb***cc***/   {/\*} [^*]* (?: \*+(?!/) [^*]* )*+ \*/   succeed
/*{aaaa}**bbb***cc***/   /\* {[^*]*} (?: \*+(?!/) [^*]* )*+ \*/   succeed
/*aaaa**bbb***cc***/     /\* [^*]* {(?: \*+(?!/) [^*]* )*+} \*/   try group
/*aaaa{**}bbb***cc***/   /\* [^*]* (?: {\*+}(?!/) [^*]* )*+ \*/   succeed
/*aaaa**{b}bb***cc***/   /\* [^*]* (?: \*+{(?!/)} [^*]* )*+ \*/   verified
/*aaaa**{bbb}***cc***/   /\* [^*]* (?: \*+(?!/) {[^*]*} )*+ \*/   succeed
/*aaaa**bbb***cc***/     /\* [^*]* {(?: \*+(?!/) [^*]* )*+} \*/   try group
/*aaaa**bbb{***}cc***/   /\* [^*]* (?: {\*+}(?!/) [^*]* )*+ \*/   succeed
/*aaaa**bbb***{c}c***/   /\* [^*]* (?: \*+{(?!/)} [^*]* )*+ \*/   verified
/*aaaa**bbb***{cc}***/   /\* [^*]* (?: \*+(?!/) {[^*]*} )*+ \*/   succeed
/*aaaa**bbb***cc***/     /\* [^*]* {(?: \*+(?!/) [^*]* )*+} \*/   try group
/*aaaa**bbb***cc{***}/   /\* [^*]* (?: {\*+}(?!/) [^*]* )*+ \*/   succeed
/*aaaa**bbb***cc***{/}   /\* [^*]* (?: \*+{(?!/)} [^*]* )*+ \*/   fail
/*aaaa**bbb***cc{**}*/   /\* [^*]* (?: {\*+}(?!/) [^*]* )*+ \*/   backtrack
/*aaaa**bbb***cc**{*}/   /\* [^*]* (?: \*+{(?!/)} [^*]* )*+ \*/   verified
/*aaaa**bbb***cc***/     /\* [^*]* (?: \*+(?!/) {[^*]*} )*+ \*/   succeed
/*aaaa**bbb***cc***/     /\* [^*]* {(?: \*+(?!/) [^*]* )*+} \*/   try group
/*aaaa**bbb***cc**{*}/   /\* [^*]* (?: {\*+}(?!/) [^*]* )*+ \*/   succeed
/*aaaa**bbb***cc***{/}   /\* [^*]* (?: \*+{(?!/)} [^*]* )*+ \*/   fail
/*aaaa**bbb***cc***/     /\* [^*]* {(?: \*+(?!/) [^*]* )*+} \*/   fail
/*aaaa**bbb***{cc**}*/   /\* [^*]* {(?: \*+(?!/) [^*]* )*+} \*/   backtrack
/*aaaa**bbb***cc**{*/}   /\* [^*]* (?: \*+(?!/) [^*]* )*+ {\*/}   succeed
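
The end result of the walk can be checked with a short script. This is only a sketch: the surrounding text in $subject (the "x", "y" and the second comment) is a hypothetical addition, used to show that each match stops at the first closing */ .

```php
<?php
// Multiline comment pattern from the walk, written with the x modifier
// so that the spaces inside the pattern are ignored.
$pattern = '~/\* [^*]* (?: \*+(?!/) [^*]* )*+ \*/~x';

// The string from the walk, plus some hypothetical surrounding text.
$subject = 'x /*aaaa**bbb***cc***/ y /* second */';

preg_match_all($pattern, $subject, $matches);
print_r($matches[0]);
```

Each comment is reported as a separate match, because [^*]* and the group can never run past a */ sequence.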