I am new to clang-query and I need help to understand how to build ASTMatcher expressions for a code profiling tool I am trying to create. Eventually I would like to combine these AST matchers with a libTooling application to rewrite the source code.
Here is where I am havinve trouble. As a good starting point, I would like to break apart conditional compound statements into their individual sub-expressions, but I do not know how to do this and the AST matcher reference overwhelming for a newbie to clang.
Here is an example of a block of 'C' code I would like to analyze:
#include <stdbool.h>
// example code
int foo(const int* pa, int *pb) {
if (pa[0] > 1) {
pb[0] = 1;
} else if ((pa[0] == 1) && (pa[1] == 1)) {
pb[0] = 2;
} else if ((pa[0] == 0) && (pa[1] == 1)) {
pb[0] = 3;
} else if ((pa[0] == 1) && (pa[1] == 1) || (pb[0] == 2)) {
pb[0] = 4;
}
pb[0] = 2;
switch (pa[0]) {
case 0:
case 1:
break;
case 2:
break;
default:
break;
}
return 3;
}
int bar (int arg1, bool arg2) {
return arg1 + (arg2 ? 1 : 0);
}
int main(int argc, const char **argv) {
const int a[3] = { 0, 1, 2};
int b[3] = {-1,-1,-1};
foo(a, b);
bar (3, true);
}
Using clang-query, I would like to identify all the compound conditional statements in the above code but I am having a tough time doing so. From what I understand, I should be able to use the AST dump to build AST matcher expressions.
As a very simple starting point, I can identify all the if statement conditions as follows:
clang-query> match ifStmt()
Match #1:
C:\Users\johnc\main\rewriter\.\input04.c:5:5: note: "root" binds here
if (pa[0] > 1) {
^~~~~~~~~~~~~~~~
Match #2:
C:\Users\johnc\main\rewriter\.\input04.c:7:12: note: "root" binds here
} else if ((pa[0] == 1) && (pa[1] == 1)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Match #3:
C:\Users\johnc\main\rewriter\.\input04.c:9:12: note: "root" binds here
} else if ((pa[0] == 0) && (pa[1] == 1)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Match #4:
C:\Users\johnc\main\rewriter\.\input04.c:11:12: note: "root" binds here
} else if ((pa[0] == 1) && (pa[1] == 1) || (pb[0] == 2)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
4 matches.
clang-query>
however I would also like to identify the individual sub expressions. I turned on the ast tree dump & reran the matcher as follows (I trimmed the output to just show the 4th if statement match):
This match encloses these sub expressions
((pa[0] == 1) && (pa[1] == 1) || (pb[0] == 2))
Whare are the steps I need to follow (presumably looking at AST nodes to identify:
((pa[0] == 1) && (pa[1] == 1) || (pb[0] == 2))
(pa[0] == 1)
(pa[1] == 1)
.
clang-query> enable output dump
clang-query> match ifStmt()
Match #1:
. . .
Match #2:
. . .
Match #3:
. . .
Match #4:
C:\Users\johnc\main\rewriter\.\input04.c:11:12: note: "root" binds here
} else if ((pa[0] == 1) && (pa[1] == 1) || (pb[0] == 2)) {
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Binding for "root":
IfStmt 0x13e55687480 <C:\Users\johnc\main\rewriter\.\input04.c:11:12, line:13:5>
|-BinaryOperator 0x13e55687380 <line:11:16, col:59> 'int' '||'
| |-BinaryOperator 0x13e55687260 <col:16, col:43> 'int' '&&'
| | |-ParenExpr 0x13e55687140 <col:16, col:27> 'int'
| | | `-BinaryOperator 0x13e55687120 <col:17, col:26> 'int' '=='
| | | |-ImplicitCastExpr 0x13e55687108 <col:17, col:21> 'int' <LValueToRValue>
| | | | `-ArraySubscriptExpr 0x13e556870c0 <col:17, col:21> 'const int' lvalue
| | | | |-ImplicitCastExpr 0x13e556870a8 <col:17> 'const int *' <LValueToRValue>
| | | | | `-DeclRefExpr 0x13e55687060 <col:17> 'const int *' lvalue ParmVar 0x13e55545a30 'pa' 'const int *'
| | | | `-IntegerLiteral 0x13e55687080 <col:20> 'int' 0
| | | `-IntegerLiteral 0x13e556870e0 <col:26> 'int' 1
| | `-ParenExpr 0x13e55687240 <col:32, col:43> 'int'
| | `-BinaryOperator 0x13e55687220 <col:33, col:42> 'int' '=='
| | |-ImplicitCastExpr 0x13e55687208 <col:33, col:37> 'int' <LValueToRValue>
| | | `-ArraySubscriptExpr 0x13e556871c0 <col:33, col:37> 'const int' lvalue
| | | |-ImplicitCastExpr 0x13e556871a8 <col:33> 'const int *' <LValueToRValue>
| | | | `-DeclRefExpr 0x13e55687160 <col:33> 'const int *' lvalue ParmVar 0x13e55545a30 'pa' 'const int *'
| | | `-IntegerLiteral 0x13e55687180 <col:36> 'int' 1
| | `-IntegerLiteral 0x13e556871e0 <col:42> 'int' 1
| `-ParenExpr 0x13e55687360 <col:48, col:59> 'int'
| `-BinaryOperator 0x13e55687340 <col:49, col:58> 'int' '=='
| |-ImplicitCastExpr 0x13e55687328 <col:49, col:53> 'int' <LValueToRValue>
| | `-ArraySubscriptExpr 0x13e556872e0 <col:49, col:53> 'int' lvalue
| | |-ImplicitCastExpr 0x13e556872c8 <col:49> 'int *' <LValueToRValue>
| | | `-DeclRefExpr 0x13e55687280 <col:49> 'int *' lvalue ParmVar 0x13e55545ae0 'pb' 'int *'
| | `-IntegerLiteral 0x13e556872a0 <col:52> 'int' 0
| `-IntegerLiteral 0x13e55687300 <col:58> 'int' 2
`-CompoundStmt 0x13e55687468 <col:62, line:13:5>
`-BinaryOperator 0x13e55687448 <line:12:9, col:17> 'int' '='
|-ArraySubscriptExpr 0x13e55687400 <col:9, col:13> 'int' lvalue
| |-ImplicitCastExpr 0x13e556873e8 <col:9> 'int *' <LValueToRValue>
| | `-DeclRefExpr 0x13e556873a0 <col:9> 'int *' lvalue ParmVar 0x13e55545ae0 'pb' 'int *'
| `-IntegerLiteral 0x13e556873c0 <col:12> 'int' 0
`-IntegerLiteral 0x13e55687420 <col:17> 'int' 4
4 matches.
clang-query>