I need to write a Java method

public static String removeComments(String code);

that removes all Java comments from a String. I know that questions like these have been asked before but:

  • Complicated regular expressions for multi-line comments, such as text = text.replaceAll("(?:/\\*(?:[^*]|(?:\\*+[^*/]))*\\*+/)|(?://.*)", " ");, work fine in small tests but often throw a StackOverflowError in my real-world cases (the | in the regex leads to branching).

  • Many people have posted "simple" functions of their own for which counterexamples are easy to construct (for example, // inside a String literal, as in a URL).

So is there a reliable, non-recursive way of solving this problem?

J Fabian Meier
  • What do you mean by "non-recursive"? Why do you care whether it is recursive, and why would you exclude a whole class of solutions? – Erwin Bolwidt Jun 07 '16 at 07:37
  • Well, the answer to your question is [Yes!](http://stackoverflow.com/questions/931762/can-every-recursion-be-converted-into-iteration), but the *how* is the difficult part. – Idos Jun 07 '16 at 07:37
  • @ErwinBolwidt OK, I don't really mean non-recursive, just not excessively recursive like the regex, which kills my stack. – J Fabian Meier Jun 07 '16 at 07:38
  • Don't use a regex for this. Iterate through character-by-character, looking for the start of comments, end of comments, start of string literals etc. – Andy Turner Jun 07 '16 at 07:38
  • I would definitely prefer that, but crafting it myself is difficult because I will probably forget some strange special cases (and I need to process a huge amount of source code, so I cannot check the result by hand afterwards). – J Fabian Meier Jun 07 '16 at 07:43
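
The character-by-character approach suggested in the comments could look roughly like the sketch below. It is a minimal, loop-based state machine (no recursion, so stack depth does not grow with input size), not a complete Java lexer. The class name CommentStripper is hypothetical. As assumptions, it treats everything between quotes as a literal, honours backslash escapes, replaces each block comment with a single space (as the regex in the question does), and drops line comments while keeping the line terminator. It deliberately does not handle text blocks (""" ... """) or comment delimiters written as Unicode escapes such as \u002F\u002A; if those occur in your sources, a real Java lexer or parser is likely the safer choice.

```java
final class CommentStripper {

    // Which kind of token the scanner is currently inside.
    private enum State { CODE, LINE_COMMENT, BLOCK_COMMENT, STRING, CHAR }

    public static String removeComments(String code) {
        StringBuilder out = new StringBuilder(code.length());
        State state = State.CODE;
        int n = code.length();
        for (int i = 0; i < n; i++) {
            char c = code.charAt(i);
            // One character of lookahead; '\0' stands for "end of input".
            char next = (i + 1 < n) ? code.charAt(i + 1) : '\0';
            switch (state) {
                case CODE:
                    if (c == '/' && next == '/') {
                        state = State.LINE_COMMENT;
                        i++; // skip the second '/'
                    } else if (c == '/' && next == '*') {
                        state = State.BLOCK_COMMENT;
                        i++; // skip the '*', so "/*/" is not seen as closed
                    } else {
                        if (c == '"') state = State.STRING;
                        else if (c == '\'') state = State.CHAR;
                        out.append(c);
                    }
                    break;
                case LINE_COMMENT:
                    // A line comment ends at the line terminator, which is kept.
                    if (c == '\n' || c == '\r') {
                        state = State.CODE;
                        out.append(c);
                    }
                    break;
                case BLOCK_COMMENT:
                    if (c == '*' && next == '/') {
                        state = State.CODE;
                        i++; // skip the closing '/'
                        out.append(' '); // keep neighbouring tokens separated
                    }
                    break;
                case STRING:
                case CHAR:
                    out.append(c);
                    if (c == '\\' && i + 1 < n) {
                        // Copy the escaped character verbatim so that \" or \'
                        // does not end the literal.
                        out.append(next);
                        i++;
                    } else if (c == (state == State.STRING ? '"' : '\'')) {
                        state = State.CODE;
                    }
                    break;
            }
        }
        return out.toString();
    }
}
```

Calling CommentStripper.removeComments(source) on a line such as String u = "http://x.y"; /* gone */ leaves the URL inside the literal untouched and removes only the block comment. Because the scanner is a single pass with a fixed set of states, its running time is linear in the input length.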
