I want to extract all method calls from Java code. I have written the following two regular expressions, but they do not extract all of the method calls.

Reg1 : Pattern.compile("([a-zA-Z][0-9_a-zA-Z]*\\([a-zA-Z0-9_\\s,\\[\\]\\(\\)\\.]+\\))");

Reg2 : Pattern.compile("([a-zA-Z][0-9_a-zA-Z]*\\([\\s]*\\))")

Input:

"{
     if ((war == null) && (config != null)) {
    sb.append( &config= );
    sb.append(URLEncoder.encode(config,getCharset()));
    }
    if ((war == null) && (localWar != null)) {
    sb.append( &war= );
    sb.append(URLEncoder.encode(localWar,getCharset()));
    }
    if (update) {
    sb.append( &update=true );
    }
    if (tag != null) {
      sb.append( &tag= );
      sb.append(URLEncoder.encode(tag,getCharset()));
     }
     }"

output:

getCharset getCharset getCharset append append append

I am not able to extract "encode".

Does anyone have any idea what I should add to the regular expression?

nhahtdh
Sangeeta
  • This is (according to well-established principles of language theory) impossible to do using regular expressions, mainly because each call may contain calls that may contain calls that ... – laune Sep 17 '15 at 05:02
  • Please suggest an alternative. – Sangeeta Sep 17 '15 at 05:06
  • Perhaps this post is of some help: http://stackoverflow.com/questions/2206065/java-parse-java-source-code-extract-methods – Mariano Sep 17 '15 at 05:19
  • Thanks, but I want to extract the method calls made in the program. – Sangeeta Sep 17 '15 at 05:24
  • Same goes for Java: http://stackoverflow.com/questions/6751105/why-its-not-possible-to-use-regex-to-parse-html-xml-a-formal-explanation-in-la – Nir Alfasi Sep 17 '15 at 06:56
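
As a rough illustration of why "encode" goes missing: Reg1 consumes the whole argument list of the outer call, and regex matches do not overlap, so anything nested inside an already matched `append(...)` is never seen. If only the names are needed, one heuristic is to match just an identifier followed by `(` and not the arguments at all. The sketch below assumes that approach; the class name `CallNameScanner` and the keyword list are made up for illustration. It is not a parser: it will also report method declarations, constructor names after `new`, and anything that looks like a call inside string literals or comments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CallNameScanner {
    // An identifier immediately followed by "(". The argument list is
    // deliberately NOT part of the match, so nested calls are still found.
    private static final Pattern NAME_BEFORE_PAREN =
            Pattern.compile("\\b([A-Za-z_][A-Za-z0-9_]*)\\s*\\(");

    // Keywords that can precede "(" but are not method names (illustrative list).
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
            "if", "for", "while", "switch", "catch", "synchronized",
            "try", "return", "throw"));

    // Returns the candidate method names in the order they appear in the source.
    static List<String> extract(String source) {
        List<String> names = new ArrayList<>();
        Matcher m = NAME_BEFORE_PAREN.matcher(source);
        while (m.find()) {
            String name = m.group(1);
            if (!KEYWORDS.contains(name)) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String line = "sb.append(URLEncoder.encode(config,getCharset()));";
        System.out.println(extract(line)); // prints [append, encode, getCharset]
    }
}
```

This only works around the nesting problem for name extraction; for anything that needs the receiver, the arguments, or reliable results on real code, a parser is still the right tool.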

1 Answer

You need a Java source code parser for this task. Here is an example that uses Java Parser; it parses its own source file and prints every method call it finds:

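// Imports assume JavaParser 2.x (package com.github.javaparser);
// 1.x releases used the japa.parser package instead.
import java.io.FileInputStream;
import java.util.Arrays;
import java.util.List;

import com.github.javaparser.JavaParser;
import com.github.javaparser.ast.CompilationUnit;
import com.github.javaparser.ast.expr.BinaryExpr;
import com.github.javaparser.ast.expr.Expression;
import com.github.javaparser.ast.expr.MethodCallExpr;
import com.github.javaparser.ast.visitor.VoidVisitorAdapter;

public class MethodCallPrinter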
{
    public static void main(String[] args) throws Exception
    {
        FileInputStream in = new FileInputStream("MethodCallPrinter.java");

        CompilationUnit cu;
        try
        {
            cu = JavaParser.parse(in);
        }
        finally
        {
            in.close();
        }
        new MethodVisitor().visit(cu, null);
    }

    // Prints each method call and descends into its arguments by hand.
    private static class MethodVisitor extends VoidVisitorAdapter<Object>
    {
        @Override
        public void visit(MethodCallExpr methodCall, Object arg)
        {
            System.out.print("Method call: " + methodCall.getName() + "\n");
            List<Expression> args = methodCall.getArgs();
            if (args != null)
                handleExpressions(args);
        }

        private void handleExpressions(List<Expression> expressions)
        {
            for (Expression expr : expressions)
            {
                if (expr instanceof MethodCallExpr)
                    visit((MethodCallExpr) expr, null);
                else if (expr instanceof BinaryExpr)
                {
                    BinaryExpr binExpr = (BinaryExpr)expr;
                    handleExpressions(Arrays.asList(binExpr.getLeft(), binExpr.getRight()));
                }
            }
        }
    }
}

Output:

Method call: parse
Method call: close
Method call: visit
Method call: print
Method call: getName
Method call: getArgs
Method call: handleExpressions
Method call: visit
Method call: handleExpressions
Method call: asList
Method call: getLeft
Method call: getRight
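
Note that the visitor above only descends into arguments and binary expressions. A call used as the receiver of another call (for example `a.b().c()`) sits in the scope of the outer call and is not reached by that hand-written recursion. To show the shape of the recursion without depending on any parser library, here is a toy expression tree; every type in it (`Expr`, `Name`, `Binary`, `Call`) is hypothetical and is not a Java Parser class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ToyCallCollector {
    interface Expr {}

    // A plain identifier such as "sb" or "config".
    static final class Name implements Expr {
        final String id;
        Name(String id) { this.id = id; }
    }

    // A binary expression such as "a + b".
    static final class Binary implements Expr {
        final Expr left;
        final Expr right;
        Binary(Expr left, Expr right) { this.left = left; this.right = right; }
    }

    // A method call: optional receiver (scope), method name, arguments.
    static final class Call implements Expr {
        final Expr scope; // null for unqualified calls like getCharset()
        final String name;
        final List<Expr> args;
        Call(Expr scope, String name, Expr... args) {
            this.scope = scope;
            this.name = name;
            this.args = Arrays.asList(args);
        }
    }

    // Records the call name, then recurses into the scope AND every argument.
    static void collect(Expr e, List<String> out) {
        if (e instanceof Call) {
            Call c = (Call) e;
            out.add(c.name);
            if (c.scope != null) {
                collect(c.scope, out);
            }
            for (Expr arg : c.args) {
                collect(arg, out);
            }
        } else if (e instanceof Binary) {
            Binary b = (Binary) e;
            collect(b.left, out);
            collect(b.right, out);
        }
        // Name nodes contain no calls, so there is nothing to do for them.
    }

    static List<String> collect(Expr e) {
        List<String> out = new ArrayList<>();
        collect(e, out);
        return out;
    }

    public static void main(String[] args) {
        // Models: sb.append(URLEncoder.encode(config, getCharset()))
        Expr e = new Call(new Name("sb"), "append",
                new Call(new Name("URLEncoder"), "encode",
                        new Name("config"), new Call(null, "getCharset")));
        System.out.println(collect(e)); // prints [append, encode, getCharset]
    }
}
```

A real AST has many more expression kinds (unary, conditional, casts, and so on), so in practice it is usually easier to let the parser library's own traversal visit the children than to enumerate every case by hand.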
Koray Tugay
splash
  • The main difficulty is to handle recursions in the sentences. You might add this to your answer, e.g., another visitor for expressions. – laune Sep 17 '15 at 09:00
  • @laune what do you mean by "recursions in the sentences"? – splash Sep 17 '15 at 12:05
  • The Java grammar defines how "sentences" of the language are to be constructed. In the definition of "method call", via some nonterminals (NTs) in between, "method call" appears again: recursion. So there must be an iteration `for( Expression e: methodCall.getArgs() ){ ... }` etc. – laune Sep 17 '15 at 12:32
  • @laune Understood. I overlooked that. I thought that Java Parser would handle this gracefully. :-) – splash Sep 17 '15 at 13:24
  • This should be sufficient to "show the way". Although, seeing the many subclasses of Expression, there may be other classes where handling is required, perhaps UnaryExpression or ConditionalExpression (if there are such subclasses - I'm working from memory right now). – laune Sep 17 '15 at 16:39
  • I want to parse part of the source code, not the complete file. Is it possible to parse fragments of code? – Sangeeta Sep 18 '15 at 08:53
  • @Sangeeta you could use `methodCall.getBeginLine()` to check if the method call is inside of your desired part of the source code. – splash Sep 19 '15 at 11:43
  • @splash, when I create the parser from only part of the code, it gives an error because the fragment is incomplete: sometimes a `;`, `)`, `}`, or the `while` of a do-while loop is missing. Any ideas on how to handle incomplete code? – Sangeeta Sep 25 '15 at 09:47
  • @Sangeeta you should always parse valid Java code. If you only have a part of the code, then you must complete it so that it is valid, and ignore the added code lines in the `MethodVisitor`. – splash Sep 25 '15 at 11:02
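
The last suggestion (complete the fragment so it is valid, then ignore the added lines) could be sketched roughly as follows. This is only an assumption about how one might do it: the class `FragmentWrapper`, the wrapper names `Wrapper__` and `wrapped__`, and the two-line header are all invented for illustration. It does not repair fragments that are themselves incomplete (missing `;`, `)` or `}`); it only supplies the surrounding class and method and maps line numbers back.

```java
public class FragmentWrapper {
    // Hypothetical scaffolding placed around a statement-level fragment.
    private static final String HEADER = "class Wrapper__ {\nvoid wrapped__() {\n";
    private static final String FOOTER = "\n}\n}\n";

    // Number of lines the header adds in front of the fragment.
    static final int HEADER_LINES = 2;

    // Returns source text that a parser can treat as a complete compilation unit,
    // provided the fragment itself is a sequence of well-formed statements.
    static String wrap(String fragment) {
        return HEADER + fragment + FOOTER;
    }

    // Maps a 1-based line number in the wrapped source back to the fragment.
    static int toFragmentLine(int wrappedLine) {
        return wrappedLine - HEADER_LINES;
    }

    // True if the wrapped-source line belongs to the fragment, not the scaffolding.
    static boolean isInFragment(int wrappedLine, String fragment) {
        int fragmentLines = fragment.split("\n", -1).length;
        int line = toFragmentLine(wrappedLine);
        return line >= 1 && line <= fragmentLines;
    }

    public static void main(String[] args) {
        String fragment = "sb.append(x);\nsb.append(y);";
        System.out.println(wrap(fragment));
        // A call reported on line 4 of the wrapped source is line 2 of the fragment.
        System.out.println(toFragmentLine(4)); // prints 2
    }
}
```

In the visitor, the line number reported for a call (for instance from `methodCall.getBeginLine()`, as mentioned above) would be passed to `isInFragment` so that anything belonging to the scaffolding is skipped.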