I believe I have a new best pattern for you.
/\/\*[\s\S]*?\*\/|(['"])[\s\S]+?\1(*SKIP)(*FAIL)|\/{2}.*/
This will accurately process the following block of text in just 683 steps:
<?
/* This is a comment */
cout << "Hello World"; // prints Hello World
/*
* C++ comments can also
*/
cout << "Hello World";
/* Comment out printing of Hello World:
cout << "Hello World"; // prints Hello World
*/
echo "//This line was not a Comment, but ... ";
echo "http://stackoverflow.com";
echo 'http://stackoverflow.com/you can not match this line';
array = ['//', 'no, you can not match this line!!']
/* This is * //a comment */
Pattern Explanation: (Demo: you can use the Substitution box at the bottom to replace the comment substrings with an empty string, effectively removing all comments.)
/\/\*[\s\S]*?\*\/
Match /* then zero or more characters (as few as possible), then */
|
OR
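(['"])[\s\S]+?\1(*SKIP)(*FAIL)
Don't match a quoted substring: ' or ", then one or more characters, then the leading (captured) quote character. (*SKIP)(*FAIL) consumes the quoted substring and discards it, so comment markers inside quotes are ignored.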
|
OR
\/{2}.*/
Match // then zero or more non-newline characters
Using [\s\S] is like . except that it also matches newline characters. This is deliberate in the first two alternatives, where a comment or quoted string may span several lines. The third alternative intentionally uses . so that the match stops at the first newline character.
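To make the difference between [\s\S] and . concrete, here is a small sketch. It uses Python's built-in re module rather than PCRE (my assumption, chosen only for illustration), and the sample string is made up. Without a DOTALL flag, . behaves the same way in both engines.

```python
import re

# Hypothetical sample: a two-line block comment followed by a line comment.
text = "/* line one\nline two */ code(); // trailing note\nmore();"

# [\s\S] matches any character, including "\n", so a block comment
# that spans lines is matched in full.
block = re.search(r"/\*[\s\S]*?\*/", text)

# "." does not match "\n" by default, so the same pattern written with
# "." cannot cross the line break and finds nothing in this text.
block_with_dot = re.search(r"/\*.*?\*/", text)

# For "//" comments that limitation is exactly what is wanted:
# the match stops at the end of the line.
line = re.search(r"//.*", text)

print(repr(block.group(0)))
print(block_with_dot)
print(repr(line.group(0)))
```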
I have tested every ordering of the alternatives to ensure that the fastest alternatives come first and the pattern is optimized. My pattern correctly matches the OP's sample input. If anyone finds an issue with my pattern, please leave me a comment so that I can try to fix it.
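For anyone who wants to try the same idea outside PCRE, here is a rough sketch in Python. Python's built-in re module does not support (*SKIP)(*FAIL), so this is not my pattern verbatim. Instead, the quoted-string alternative is captured in a named group and a replacement callback puts it back unchanged. The names COMMENT_RE, strip_comments and keep_strings are my own for this illustration.

```python
import re

# Same three alternatives as the pattern above, in the same order.
# The "quoted" group stands in for (*SKIP)(*FAIL): quoted strings are
# matched so that their contents are consumed, then restored unchanged.
COMMENT_RE = re.compile(
    r"""/\*[\s\S]*?\*/                 # block comment
      | (?P<quoted>(['"])[\s\S]+?\2)   # quoted string (kept)
      | //.*                           # line comment
    """,
    re.VERBOSE,
)


def strip_comments(source):
    """Return source with comments removed and quoted strings intact."""
    def keep_strings(match):
        # A quoted string is returned as-is; a comment becomes "".
        return match.group("quoted") or ""
    return COMMENT_RE.sub(keep_strings, source)


print(strip_comments('echo "http://stackoverflow.com"; // a comment'))
```

The alternatives are tried left to right at each position, so a quote character is always consumed together with its closing partner before the // alternative gets a chance to see what is inside the string.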
Jan's pattern correctly matches all of the OP's desired substrings in 1006 steps using: ~([\'\"])(?<!\\).*?\1(*SKIP)(*FAIL)|(?|(?P<comment>(?s)\/\*.*?\*\/(?-s))|(?P<comment>\/\/.+))~gx
Sahil's pattern fails to completely match the final comment in your UPDATED sample input. This means either the question is wrong and should be closed as "unclear what you are asking", or Sahil's answer is wrong and it should not be awarded the green tick. When you updated your question, you should have requested that Sahil update his answer. When incorrect answers fail to satisfy the question, future SO readers are likely to become confused and SO becomes a less reliable resource.