I'm trying to write a regex that matches Javadoc comments in Java files, but my code below isn't working.

Specifically, I only want to match the initial Javadoc comment that appears before the first occurrence of public class blah {

The code below finds JAVA_COMMENT_OPEN_TAG without any issues but never finds JAVA_COMMENT_CLOSE_TAG:

public class JavaDocParser {

  private String JAVA_COMMENT_OPEN_TAG = "^/\\*\\**+";
  private String JAVA_COMMENT_CLOSE_TAG = "[.]+\\*{1}+/{1}$";
  private StringBuilder javaDocComment = new StringBuilder();

  public JavaDocParser(File javaFile) throws TestException {
    parseJavaDocHeader(javaFile); 
    printJavaDocComment();
  }

  private void parseJavaDocHeader(File javaFile) throws TestException {
    BufferedReader br = null;

    Pattern openPattern = Pattern.compile(JAVA_COMMENT_OPEN_TAG);
    Pattern closePattern = Pattern.compile(JAVA_COMMENT_CLOSE_TAG);

    boolean openTagFound = false;
    boolean closeTagFound = false;

    try {

      br = new BufferedReader(new FileReader(javaFile));

      String line;
      while((line = br.readLine()) != null) {
        Matcher openMatcher = openPattern.matcher(line);
        Matcher closeMatcher = closePattern.matcher(line);

        if(openMatcher.matches()) {
          System.out.println("OPEN TAG FOUND ON LINE: ====> " + line);
          openTagFound = true;
        }

        if(closeMatcher.matches()) {
          System.out.println("CLOSE TAG FOUND ON LINE: ====> " + line);
          closeTagFound = true;
        }

        if(openTagFound) {
          addToStringBuilder(line);
        } else if(closeTagFound) {
          break;
        }
      }
    } catch (FileNotFoundException e) {
      throw new TestException("The " + javaFile.getName() + " file provided could not be found.  Check the file and try again.");
    } catch (IOException e) {
      throw new TestException("A problem was encountered while reading the .java file");
    } finally {
      try {
        if(br != null) { br.close(); }
      } catch (IOException e) {
        e.printStackTrace();
      }
    }
  }

  private void addToStringBuilder(String stringToAdd) {
    javaDocComment.append(stringToAdd + "\n");
  }

  public String getJavaDocComment() { return javaDocComment.toString(); }

  public void printJavaDocComment() { System.out.println(javaDocComment.toString()); }
}
nullByteMe
  • @JAL thanks that question helps a lot. I need to reword my question because I'm trying to do something a bit more specific. – nullByteMe May 22 '15 at 18:17
  • Also, in general, regex is not a replacement for a parser. http://stackoverflow.com/questions/11905506/regular-expression-vs-string-parsing – markspace May 22 '15 at 18:20
  • @markspace thanks, that brings up a point I didn't give too much thought to. I guess I considered a regex in the event that there are inconsistencies in the way people open and close their javadoc comments. – nullByteMe May 22 '15 at 18:21

2 Answers


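In your close pattern, [.]+ is a character class, so it matches one or more literal dots rather than any character. A typical closing line such as " */" therefore never matches. You can use the following instead:

private String JAVA_COMMENT_OPEN_TAG = "^/\\*\\*";
private String JAVA_COMMENT_CLOSE_TAG = ".*?\\*+/$";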
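To see the difference between the two close patterns, here is a minimal sketch that runs both through Matcher.matches(), the same call the question uses. The class name CloseTagDemo and the helper closes are made up for illustration and are not part of the original code.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the question's code.
class CloseTagDemo {
    // Close pattern from the question: [.]+ demands one or more literal dots.
    static final Pattern ORIGINAL_CLOSE = Pattern.compile("[.]+\\*{1}+/{1}$");
    // Close pattern suggested in this answer: any characters, then one or more '*', then '/'.
    static final Pattern SUGGESTED_CLOSE = Pattern.compile(".*?\\*+/$");

    // matches() succeeds only if the pattern covers the entire line.
    static boolean closes(Pattern pattern, String line) {
        return pattern.matcher(line).matches();
    }

    public static void main(String[] args) {
        System.out.println(closes(ORIGINAL_CLOSE, " */"));   // false: no literal dot before "*/"
        System.out.println(closes(SUGGESTED_CLOSE, " */"));  // true
        System.out.println(closes(ORIGINAL_CLOSE, "...*/")); // true: the only shape the original accepts
    }
}
```

Because matches() has to consume the whole line, the suggested pattern still rejects a line with trailing whitespace after the closing slash.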


karthik manchala

Your regexes are anchored with the beginning-of-line and end-of-line patterns ^ and $, but there could be whitespace before or after the comment delimiters. Note also that [.]+ matches only literal dots, so it needs to become .* to accept ordinary text before the closing tag.

It would probably be safer to use:

private String JAVA_COMMENT_OPEN_TAG = "^\\s*/\\*\\**+";
private String JAVA_COMMENT_CLOSE_TAG = ".*\\*/\\s*$";
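Putting the whitespace-tolerant idea together with the reading loop, here is a rough sketch of how the first Javadoc block could be collected. It is an assumption-laden illustration rather than a drop-in fix: the class JavaDocHeaderSketch and the method firstJavaDoc are hypothetical names, it uses find() instead of matches() so the patterns only have to match part of a line, and it stops reading as soon as the closing tag has been appended (the loop in the question never reaches its break once the open tag has been found).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.regex.Pattern;

// Hypothetical sketch class, not the asker's JavaDocParser.
class JavaDocHeaderSketch {
    // Optional leading whitespace, then "/**".
    private static final Pattern OPEN = Pattern.compile("^\\s*/\\*\\*");
    // "*/" followed only by optional whitespace up to the end of the line.
    private static final Pattern CLOSE = Pattern.compile("\\*/\\s*$");

    // Returns the first Javadoc block found, or an empty string if there is none.
    static String firstJavaDoc(BufferedReader reader) throws IOException {
        StringBuilder comment = new StringBuilder();
        boolean inside = false;
        String line;
        while ((line = reader.readLine()) != null) {
            if (!inside && OPEN.matcher(line).find()) {
                inside = true;
            }
            if (inside) {
                comment.append(line).append('\n');
                // Stop after the line that closes the comment, including "/** one-liner */".
                if (CLOSE.matcher(line).find()) {
                    break;
                }
            }
        }
        return comment.toString();
    }

    public static void main(String[] args) throws IOException {
        String source = "package demo;\n\n/**\n * Example header.\n */\npublic class Blah {\n}\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(source))) {
            System.out.print(firstJavaDoc(reader));
        }
    }
}
```

This sketch does not check for the class declaration, so a file with no header comment would return the first Javadoc block found later in the file. As the comments on the question point out, a real parser is the more robust choice if that matters.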
Eric Leibenguth