I need a C# regex to delete everything between /* and */, including the /* and */ delimiters themselves.
So, basically remove all code comments in the given text.
- You really don't need a regex for that. – Brian Driscoll May 26 '11 at 12:16
- So what is the question? – Renatas M. May 26 '11 at 12:19
- That is not that easy. Your code may contain strings like "This: /* boo */ is no comment". – Jens May 26 '11 at 12:31
- Or commented comments: `// no comment here /*`, followed by `WillBeRemoved(); /* real comment */`. OK, not too common, but you can get very creative with messing this up. – Kobi May 26 '11 at 12:46
- C# is not a *regular language*, so it is impossible to recognize it correctly with a *regular expression*. If you want to remove comments correctly then what you have to build is a *lexer*. Break the text up into tokens and identify which tokens are comments. – Eric Lippert May 26 '11 at 15:18
- Why on earth would you want to remove comments in a piece of code? Please do not make programmers in general look stupid by actually doing this. – Security Hound May 26 '11 at 18:04
- @Eric: although they are certainly not the right tool for this job, .NET regular expressions are not limited to recognizing regular languages (e.g. see http://msdn.microsoft.com/en-us/library/bs2twtah.aspx#balancing_group_definition). – kvb May 26 '11 at 19:34
- @Ramhound: There are lots of reasons to remove comments. For example, when compressing code that is going to be delivered over a highly performance-sensitive channel where it's not going to be read by humans on the other end. – Eric Lippert May 26 '11 at 20:28
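The pitfalls raised in the comments above (comment markers inside string literals, and /* inside a // comment) can be illustrated with a small hand-written scanner in the spirit of the lexer suggestion. This is only a minimal sketch, not a full C# lexer: the class name CommentStripper and the method Strip are hypothetical, and it assumes the input uses only regular "..." strings and '.' char literals (verbatim @"..." and interpolated strings are not handled).

```csharp
using System;
using System.Text;

public static class CommentStripper
{
    // Hypothetical helper: removes /* ... */ and // ... comments while copying
    // regular "..." strings and '.' char literals through unchanged.
    // Assumption: verbatim (@"...") and interpolated strings are NOT handled.
    public static string Strip(string source)
    {
        var output = new StringBuilder(source.Length);
        int i = 0;
        while (i < source.Length)
        {
            char c = source[i];
            char next = i + 1 < source.Length ? source[i + 1] : '\0';

            if (c == '"' || c == '\'')
            {
                // Copy a string or char literal verbatim, honouring backslash escapes.
                char quote = c;
                output.Append(c);
                i++;
                while (i < source.Length && source[i] != quote)
                {
                    if (source[i] == '\\' && i + 1 < source.Length)
                    {
                        output.Append(source[i]); // keep the backslash
                        i++;
                    }
                    output.Append(source[i]);
                    i++;
                }
                if (i < source.Length)
                {
                    output.Append(source[i]); // closing quote
                    i++;
                }
            }
            else if (c == '/' && next == '*')
            {
                // Block comment: skip to the first "*/" (C# comments do not nest).
                int end = source.IndexOf("*/", i + 2, StringComparison.Ordinal);
                i = end < 0 ? source.Length : end + 2;
            }
            else if (c == '/' && next == '/')
            {
                // Line comment: skip up to, but not including, the line break.
                while (i < source.Length && source[i] != '\n' && source[i] != '\r')
                {
                    i++;
                }
            }
            else
            {
                output.Append(c);
                i++;
            }
        }
        return output.ToString();
    }
}
```

With this sketch, Strip applied to the line `var s = "This: /* boo */ is no comment"; /* real */` keeps the string literal intact and drops only the trailing comment, and the `// no comment here /*` line is removed as a whole before the scanner ever looks for a block comment.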
3 Answers
Should be something like this:
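// Verbatim string (@"...") so the backslashes reach the regex engine unchanged.
var regex = new Regex(@"/\*((?!\*/).)*\*/", RegexOptions.Singleline);
// Replace returns a new string; the input itself is not modified.
var result = regex.Replace(input, "");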
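For anyone who wants to try the pattern in isolation, here is a minimal self-contained sketch. The class name BlockCommentRegex and the method Remove are made up for illustration; the pattern itself is the one from this answer. RegexOptions.Singleline makes `.` match line breaks, so multi-line comments are covered.

```csharp
using System.Text.RegularExpressions;

public static class BlockCommentRegex
{
    // Pattern from the answer: "/*", then any characters that do not start
    // a "*/", then the closing "*/".
    private static readonly Regex BlockComment =
        new Regex(@"/\*((?!\*/).)*\*/", RegexOptions.Singleline);

    // Returns a copy of the input with every /* ... */ span removed.
    public static string Remove(string input)
    {
        return BlockComment.Replace(input, "");
    }
}
```

For example, `BlockCommentRegex.Remove("int a; /* first\n second */ int b; /* third */")` returns `"int a;  int b; "`. Note that this works on raw text only, so a /* ... */ sequence inside a string literal is removed as well.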

petho
Be wary that comments can be nested. If comments can be nested like in SQL, the basic regex is going to look like this:
/\*.*?\*/
You'll then need to loop until you're stripping nothing.
If, by contrast, comments end on the first */ like in C, you need it greedy with a negative lookahead:
/\*((?!\*/).)*\*/
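One caveat worth checking when comments really do nest: on input such as `x /* a /* b */ c */ y`, the lazy pattern stops at the first */ and leaves ` c */` behind, so repeating it does not clean up the remainder. A loop only converges if each pass matches an innermost comment. The following is a minimal sketch under that assumption; the class name NestedCommentRegex and the extra `/\*` inside the lookahead are illustrative additions, not part of the answer.

```csharp
using System.Text.RegularExpressions;

public static class NestedCommentRegex
{
    // Matches only an innermost comment: the body may contain
    // neither "/*" nor "*/".
    private static readonly Regex Innermost =
        new Regex(@"/\*((?!/\*|\*/).)*\*/", RegexOptions.Singleline);

    // Strips innermost comments repeatedly until a pass changes nothing.
    public static string Remove(string input)
    {
        string previous;
        do
        {
            previous = input;
            input = Innermost.Replace(input, "");
        } while (input != previous);
        return input;
    }
}
```

With this sketch, `NestedCommentRegex.Remove("x /* a /* b */ c */ y")` returns `"x  y"`: the first pass removes `/* b */`, the second removes the now-flat outer comment. For C-style comments that do not nest, a single pass of either pattern from the answer is enough.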

Denis de Bernardy
I also needed to ignore line comments of the form
// blablabla
So, in case someone else needs this too, extend the regex with a final alternative, |(//.*), so that the complete pattern becomes:
(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)|(//.*)
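A minimal sketch of how the combined pattern might be used follows. The class name AllCommentsRegex and the method Remove are hypothetical. No RegexOptions are passed, so `.` stops at `\n` and the `//.*` alternative only consumes the rest of the current line.

```csharp
using System.Text.RegularExpressions;

public static class AllCommentsRegex
{
    // Combined pattern from the answer: a /* ... */ block comment,
    // or a // comment running to the end of the line.
    private static readonly Regex AnyComment = new Regex(
        @"(/\*([^*]|[\r\n]|(\*+([^*/]|[\r\n])))*\*+/)|(//.*)");

    // Returns a copy of the input with both comment styles removed.
    public static string Remove(string input)
    {
        return AnyComment.Replace(input, "");
    }
}
```

For example, `AllCommentsRegex.Remove("int a; // first\nint b; /* block\n comment */ int c;")` returns `"int a; \nint b;  int c;"`. Be aware that the pattern knows nothing about string literals: `var url = "http://example.com";` is cut down to `var url = "http:`, because everything from the // onward is treated as a comment.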

Nachokhan