I researched this for a while, but surprisingly none of the methods or regular expressions I found worked properly.

I need a method that removes all kinds of single and multi-line comments from a source code file.

Several of the regular expressions I tried, such as

sourceCode.replaceAll("(/\\*([^*]|[\\r\\n]|(\\*+([^*/]|[\\r\\n])))*\\*+/|[ \\t]*//.*)", "");

threw an exception:

Exception in thread "main" java.lang.StackOverflowError

I also found solutions such as this one, which worked well but still left a few comment characters in the processed source code.

Another method such as this one worked almost perfectly, but it failed on comments of the form /*// Hi */ and left those blocks untouched.

I got a different result from every regex I tried. How can I reliably accomplish this task?

BullyWiiPlaza

  • Possible duplicate. Check the solutions here: http://stackoverflow.com/questions/9078528/tool-to-remove-javadoc-comments – ANooBee Feb 10 '16 at 19:41
  • Are you certain regexes can do it at all? – Louis Wasserman Feb 10 '16 at 19:42
  • @ANooBee: This is for regular comments, not specific to Javadoc. The regular expression that was just posted as a comment (and now deleted) actually worked fine for multi-line comments. That's all I needed to be honest: `sourceCode.replaceAll("/\\*[^*]*\\*++(?:[^/*][^*]*\\*++)*/", "");` – BullyWiiPlaza Feb 10 '16 at 19:55

1 Answer

Here's a simplified version from my answer on JavaScript comment removal:

Replace:

(?m)((["'])(?:\\.|.)*?\2)|//.*?$|/\*[\s\S]*?\*/

With $1.

Demo here

The answer I linked to explains in detail how this pattern works. This one is simpler because Java has no regex literals in its syntax; those make the replacement much nastier.
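
As a minimal sketch of how the replacement above could be applied in Java: the pattern is the one shown, escaped for a Java string literal, while the class name CommentStripper and the method stripComments are made up for illustration. It assumes ordinary single-line string and char literals; Java text blocks and very long literals are not accounted for.

```java
import java.util.regex.Pattern;

class CommentStripper {
    // Pattern from the answer, escaped for a Java string literal.
    // Group 1 captures a quoted string or char literal so it can be kept.
    private static final Pattern COMMENT_OR_LITERAL = Pattern.compile(
            "(?m)(([\"'])(?:\\\\.|.)*?\\2)|//.*?$|/\\*[\\s\\S]*?\\*/");

    // Hypothetical helper: removes // and /* */ comments, keeps literals.
    static String stripComments(String sourceCode) {
        // "$1" re-inserts a matched literal. For a comment match, group 1
        // did not participate, so nothing is inserted in its place.
        return COMMENT_OR_LITERAL.matcher(sourceCode).replaceAll("$1");
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "int a = 1; // trailing comment",
                "/* multi-line",
                "   comment */",
                "String url = \"http://example.com\"; /*// Hi */",
                "int b = 2;");
        System.out.println(stripComments(source));
    }
}
```

Matching literals first is what keeps the // inside "http://example.com" from being treated as a comment, and the /*// Hi */ case from the question is consumed by the block-comment alternative. Whitespace and line breaks around removed comments are left in place.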

Lucas Trzesniewski