I want to parse a directory tree, find each *.cpp then eviscerate the functions, leaving me with mocks.

Parsing the tree is no problem. Evisceration is more difficult.

I am currently reading the source file into a string and looping over it character by character. If I see a closing parenthesis ) and the next non-whitespace character is an opening brace {, then I have found the start of a function.

Then I can stop writing output, counting opening and closing braces as I go, until I get to the matching closing brace } at the function's end.
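A minimal sketch of the scan described above, for illustration only. The name `strip_bodies` is hypothetical, and the heuristic is deliberately naive: it ignores comments and string literals, and it will miss signatures where something sits between `)` and `{` (for example `const` or `noexcept`).

```python
def strip_bodies(source):
    """Replace every '{...}' block that directly follows ')' with '{}'."""
    out = []
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        out.append(ch)
        i += 1
        if ch != ')':
            continue
        # Look past whitespace for an opening brace.
        j = i
        while j < n and source[j].isspace():
            j += 1
        if j >= n or source[j] != '{':
            continue
        # Assumed function start: walk to the matching closing brace.
        depth = 0
        k = j
        while k < n:
            if source[k] == '{':
                depth += 1
            elif source[k] == '}':
                depth -= 1
                if depth == 0:
                    break
            k += 1
        if k >= n:
            # Unbalanced braces: copy the rest unchanged and stop.
            out.append(source[i:])
            break
        out.append(source[i:j])  # keep the whitespace between ')' and '{'
        out.append('{}')         # emit an empty body
        i = k + 1                # resume after the matching '}'
    return ''.join(out)
```

Because the whole body is skipped once a function start is found, nested `if (...) {` blocks inside it are never examined on their own; they only contribute to the brace count.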

The code is horrible and buggy and in constant flux, so hardly worth posting.

Is there an elegant solution, probably involving regex, which will remove the body of all functions in a file, leaving the rest unchanged?

Bonus if it can detect the function's type & generate a return statement, but I can figure that out myself if need be.

Mawg says reinstate Monica
  • "eviscerate the functions", what a program! – Casimir et Hippolyte Feb 24 '15 at 11:06
  • Regexes suck at nested blocks like `{}` and `()`. Relevant question: http://stackoverflow.com/questions/1444961/is-there-a-good-python-library-that-can-parse-c – Igor Hatarist Feb 24 '15 at 11:07
  • C++ is not a regular language, but a context-free language. You need a proper parser for it; regular expressions will only make your troubles bigger. – Lachezar Feb 24 '15 at 11:09
  • It's possible (and fast with a well-designed pattern) if you use the new regex module, which handles recursion. But you can't do it with the re module. Have you checked whether a C++ parser module already exists, or have you tried a lexer module? That said, improving the code you have already written may be the way to go. – Casimir et Hippolyte Feb 24 '15 at 11:13
  • I doubt it is possible. A simple example of how things can get hairy is putting one of the functions inside a multiline comment `/* */`. A simple regex will match it, but a parser will skip comments. – Lachezar Feb 24 '15 at 11:27
  • @Lucho: this is as easy to avoid as something enclosed in a string. All you need to do is match comments and strings first. – Casimir et Hippolyte Feb 24 '15 at 11:31
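The "match them first" idea from the last comment can be sketched with the standard `re` module. This is an illustrative assumption about what was meant, not the commenter's code: the helper name `mask_noncode` is hypothetical, and the pattern does not handle C++ raw string literals, digit separators such as `1'000`, or line comments continued with a trailing backslash.

```python
import re

# Alternatives are tried at each position, so a comment or literal is
# consumed as a whole before any brace inside it can be seen.
_NONCODE = re.compile(
    r'//[^\n]*'                  # line comment
    r'|/\*.*?\*/'                # block comment
    r'|"(?:\\.|[^"\\\n])*"'      # string literal
    r"|'(?:\\.|[^'\\\n])*'",     # character literal
    re.DOTALL,
)


def mask_noncode(source):
    """Blank out comments and literals, keeping length and newlines."""
    def blank(match):
        # Replace everything except newlines so offsets stay aligned.
        return re.sub(r'[^\n]', ' ', match.group(0))
    return _NONCODE.sub(blank, source)
```

Since the masked text has the same length as the original, brace positions found in it can be used as offsets into the untouched source.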

1 Answer

You can use a parser. The Clang API provides one, and it has Python bindings: https://github.com/llvm-mirror/clang/tree/master/bindings/python

This article can give you some insight into how to use it: http://szelei.me/code-generator/

There are also some wrappers that can make your job easier, like this one: https://github.com/sztomi/cmonster

dfranca