If I have the following macro in some C++ code:

_Foo(arg1, arg2)

I would like to use Python to find me all the instances and extents of that macro using Clang and the Python bindings provided with cindex.py. I do not want to use a regular expression from Python on the code directly because that gets me 99% of the way there, but not 100%. It appears to me that to get to 100%, you need to use a real C++ parser like Clang to handle all the cases where people do silly things that are syntactically correct and compile, but don't make sense to a regular expression. I need to handle 100% of the cases and since we use Clang as one of our compilers, it makes sense to use it as the parser for this task as well.

Given the following Python code I am able to find what appear to be predefined types that the Clang python bindings know about, but not macros:

import sys
import clang.cindex

def find_typerefs(node):
    # Report the cursor this node refers to, if any
    ref_node = clang.cindex.Cursor_ref(node)
    if ref_node:
        print('Found %s Type %s DATA %s Extent %s [line=%s, col=%s]' % (
            ref_node.spelling, ref_node.kind, node.data, node.extent, node.location.line, node.location.column))

    # Recurse for children of this node
    for c in node.get_children():
        find_typerefs(c)

index = clang.cindex.Index.create()
tu = index.parse(sys.argv[1])
find_typerefs(tu.cursor)

What I think I am looking for is a way to search the raw AST for the name of my macro, _Foo(), but I am not sure. Can someone provide some code that will allow me to pass in the name of a macro and get back its extent or data from Clang?

askewchan
warmbeach
  • I haven't used Clang, but I don't think you'll find your macro in the AST. If you look at the stages of C++ compilation (http://stackoverflow.com/questions/8833524/what-are-the-stages-of-compilation-of-a-c-program), macros "disappear" in the preprocessing stage, which is completed before compilation and before the AST is generated. At that point your macro no longer exists, as it's been replaced entirely by the macro contents. I would look into the preprocessing portion of Clang and see what you can get from that. – uesp Apr 11 '12 at 21:06
  • @uesp Well, Clang is not just a compiler. It's a *great* compiler that tries hard to provide *great* diagnostics. That's why macros *are* kept track of in Clang (check the [class list](http://clang.llvm.org/doxygen/annotated.html) for occurrences of "macro"), to some degree. I'd be very surprised if that caught even the foulest syntax-defying macros, but I think it's very much possible for function-like macros such as OP's. –  Apr 11 '12 at 21:12
  • I guess Clang (as a sane compiler) has a command to just _preprocess_ the source code... – Griwes Apr 11 '12 at 21:18

2 Answers

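You need to pass the PARSE_DETAILED_PROCESSING_RECORD option to Index.parse so that macro definitions and expansions seen by the preprocessor are recorded as cursors in the translation unit: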

tu = index.parse(sys.argv[1], options=clang.cindex.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD)

The rest of the cursor visitor could look like this:

def visit(node):
    # Macro definitions and expansions only show up with the option above
    if node.kind in (clang.cindex.CursorKind.MACRO_INSTANTIATION, clang.cindex.CursorKind.MACRO_DEFINITION):
        print('Found %s Type %s DATA %s Extent %s [line=%s, col=%s]' % (
            node.displayname, node.kind, node.data, node.extent, node.location.line, node.location.column))
    for c in node.get_children():
        visit(c)
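
To answer the original request more directly (pass in a macro name, get back its extents), the visitor can be narrowed to expansions of a single name. This is only a minimal sketch, not a definitive implementation: `iter_macro_uses` and `main` are hypothetical names introduced here, the walk is kept separate from libclang so it works on any object exposing `kind`, `spelling` and `get_children()`, and it assumes the macro's name is what `spelling` returns for an expansion cursor.

```python
import sys


def iter_macro_uses(node, macro_name, expansion_kind):
    """Yield every cursor under `node` of kind `expansion_kind` named `macro_name`."""
    if node.kind == expansion_kind and node.spelling == macro_name:
        yield node
    for child in node.get_children():
        for use in iter_macro_uses(child, macro_name, expansion_kind):
            yield use


def main(path, macro_name):
    # Imported here so the walker above can be tried without libclang installed.
    import clang.cindex as ci

    index = ci.Index.create()
    # Without this option the preprocessor record is not kept, so no macro
    # cursors would be found at all.
    tu = index.parse(
        path, options=ci.TranslationUnit.PARSE_DETAILED_PROCESSING_RECORD)
    for use in iter_macro_uses(tu.cursor, macro_name,
                               ci.CursorKind.MACRO_INSTANTIATION):
        start, end = use.extent.start, use.extent.end
        print('%s: %d:%d - %d:%d' % (
            use.spelling, start.line, start.column, end.line, end.column))


# Hypothetical command line: python find_macro.py source.cpp _Foo
if __name__ == '__main__' and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Comparing against a single cursor kind keeps definitions of `_Foo` out of the results; pass `MACRO_DEFINITION` instead if the definition site is what you are after.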
thpani

I once wrote a script to pretty-print the whole AST you get from libclang, which makes it easier to see which cursor holds which piece of information.

Here it is: https://gist.github.com/2503232
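
In case the link is unavailable, the general idea of dumping the cursor tree can be sketched in a few lines. This is not the linked script, just a rough illustration under stated assumptions: `format_tree` is a hypothetical helper name, and it relies only on each cursor having `kind`, `spelling` and `get_children()`.

```python
def format_tree(node, depth=0):
    """Return one indented 'KIND spelling' line per cursor, depth-first."""
    lines = ['%s%s %s' % ('  ' * depth, node.kind, node.spelling)]
    for child in node.get_children():
        lines.extend(format_tree(child, depth + 1))
    return lines


# Assumed usage with a translation unit `tu` parsed via clang.cindex:
#     print('\n'.join(format_tree(tu.cursor)))
```

Scanning that output for the macro name is a quick way to check which cursor kind it shows up under before writing a more specific visitor.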

Sebastian