You're looking for the token spelling
and location
attributes exposed by libclang. In C++ these can be retrieved using the functions clang_getTokenLocation and clang_getTokenSpelling. A minimal use of these functions (using their python equivalents would be:
s = '''
struct Type;
void foo(Type param);
'''
idx = clang.cindex.Index.create()
tu = idx.parse('tmp.cpp', args=['-std=c++11'], unsaved_files=[('tmp.cpp', s)], options=0)
for t in tu.get_tokens(extent=tu.cursor.extent):
print t.kind, t.spelling, t.location
Gives:
TokenKind.KEYWORD struct <SourceLocation file 'tmp.cpp', line 2, column 1>
TokenKind.IDENTIFIER Type <SourceLocation file 'tmp.cpp', line 2, column 8>
TokenKind.PUNCTUATION ; <SourceLocation file 'tmp.cpp', line 2, column 12>
TokenKind.KEYWORD void <SourceLocation file 'tmp.cpp', line 3, column 1>
TokenKind.IDENTIFIER foo <SourceLocation file 'tmp.cpp', line 3, column 6>
TokenKind.PUNCTUATION ( <SourceLocation file 'tmp.cpp', line 3, column 9>
TokenKind.IDENTIFIER Type <SourceLocation file 'tmp.cpp', line 3, column 10>
TokenKind.IDENTIFIER param <SourceLocation file 'tmp.cpp', line 3, column 15>
TokenKind.PUNCTUATION ) <SourceLocation file 'tmp.cpp', line 3, column 20>
TokenKind.PUNCTUATION ; <SourceLocation file 'tmp.cpp', line 3, column 21>