
I am trying to extract all the function calls in a C file using a regex. Here is a typical example file:

int main()
{
    printf("hello to all the good people");
    printf("hello to all the good people %d ", GetLastError());

    for(int i =0; i<15; i++)
    {
        if(i == 5)
        {
            switch(i)
            {
                case 0:
                    break; 
            }
        }
    }
}

So far, I have only managed to capture function calls with the following regex:

import re

# _content holds the text of the C file
regex = re.findall(r'\w+\s*?[(].*[)]', _content)  # earlier attempt: '\w+\s*?[(]*[)]'
for i in regex:
    print(i)

My problems are:

  1. How can I tell it to ignore keywords like for or switch?
  2. How do I make it find a function call nested inside another call? For example:

printf ("%s", get_string());

  3. How do I make it ignore parentheses that appear inside quotes? For example, given the line printf("hello to j. (and rona) %s", get_family_name()); it should extract:

    function name:     parameters:
    printf             "hello to j. (and rona) %s", get_family_name()
    get_family_name    none

  • do you need to use regex? – depperm Apr 26 '18 at 15:48
  • you can filter out for and switch after if you find a regex that works, though I think creating a parser would work better – depperm Apr 26 '18 at 15:55
  • i think that regex is the best tool for this kind of mission , it's just filtering text... – Yoram Abargel Apr 26 '18 at 15:56
  • You are wrong, Yoram. While I am a big fan of regexes they are the wrong tool for parsing things like HTML or computer languages. A regex is often good enough when searching for a function definition inside your editor but is not robust enough for any non-interactive use. – Kurtis Rader Apr 27 '18 at 02:48

1 Answer


You cannot parse C using regular expressions.

There is another question about parsing HTML with regex; the answer given there applies also to C, and to essentially any useful programming language.
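
To see concretely where the regex approach breaks down, here is a small sketch that runs the pattern from the question over two lines of C. The `sample` text and the variable names are made up for illustration; only the pattern comes from the question.

```python
import re

# Pattern copied from the question; `sample` is an illustrative input.
pattern = r'\w+\s*?[(].*[)]'
sample = (
    'printf("hello to j. (and rona) %s", get_family_name());\n'
    'for(int i =0; i<15; i++)\n'
)

# findall returns non-overlapping matches, scanning left to right.
matches = re.findall(pattern, sample)
for m in matches:
    print(m)
```

This produces two matches: the whole printf(...) call as a single blob, and the for(...) header. The nested get_family_name() call is swallowed by the greedy `.*` and never reported on its own, and the pattern cannot tell a keyword from a function name or a quoted parenthesis from a real one.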

The pycparser library looks like it might be useful, particularly the func_calls example – in fact, I think the following snippet (adapted from that example) will do exactly what you want, although I haven't tested it:

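from pycparser import c_ast, parse_file

class FuncCallVisitor(c_ast.NodeVisitor):
    def visit_FuncCall(self, node):
        # node.name is a c_ast.ID for plain calls such as printf(...)
        if isinstance(node.name, c_ast.ID):
            print("{} called at {}".format(node.name.name, node.name.coord))
        # Keep walking so nested calls like printf("%s", get_string()) are found too
        self.generic_visit(node)

# use_cpp=True runs the C preprocessor on the file before parsing
ast = parse_file("myfile.c", use_cpp=True)
v = FuncCallVisitor()
v.visit(ast)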
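
If you also want the parameters of each call, as in the table from the question, something along these lines might work. This is only a sketch: it parses an in-memory string with no #include lines (so no preprocessor is needed), and `SOURCE`, `demo` and `CallCollector` are hypothetical names chosen for the example. It assumes pycparser's `c_generator.CGenerator` can be used to turn the argument nodes back into C text.

```python
from pycparser import c_ast, c_generator, c_parser

# Hypothetical input: the printf line from the question, wrapped in a
# function so that it forms a complete translation unit.
SOURCE = r'''
void demo(void)
{
    printf("hello to j. (and rona) %s", get_family_name());
    for (int i = 0; i < 15; i++) { }
}
'''

class CallCollector(c_ast.NodeVisitor):
    """Collects (name, arguments) for every plain function call."""

    def __init__(self):
        self.calls = []
        self._gen = c_generator.CGenerator()

    def visit_FuncCall(self, node):
        if isinstance(node.name, c_ast.ID):
            # node.args is None when the call has no arguments
            if node.args is not None:
                args = self._gen.visit(node.args)
            else:
                args = "none"
            self.calls.append((node.name.name, args))
        # Descend into the arguments to pick up nested calls
        self.generic_visit(node)

ast = c_parser.CParser().parse(SOURCE, filename="<demo>")
collector = CallCollector()
collector.visit(ast)
for name, args in collector.calls:
    print("{}: {}".format(name, args))
```

Because the parser works on the syntax tree, the for loop is never mistaken for a call, and the parentheses inside the string literal are just part of a string constant.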
ash