After some research and a few questions, I ended up exploring the libclang library in order to parse C++ source files in Python.
Given the C++ source:

    int fac(int n) {
        return (n > 1) ? n * fac(n - 1) : 1;
    }
    for (int i = 0; i < linecount; i++) {
        sum += array[i];
    }
    double mean = sum / linecount;

I am trying to identify the token fac as a function name, and n, i and mean as variable names, along with each one's position. I am interested in eventually tokenizing them.
I have read some very useful articles (Eli's, Gaetan's) as well as Stack Overflow questions 35113197 and 13236500.
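From that reading, my rough understanding is that the parsed file is exposed as a tree of cursors, and that each cursor has a kind, a spelling and a source location. Below is a minimal sketch of that idea, not a working solution. It assumes the Python bindings are importable as clang.cindex and that the source is saved under the hypothetical name sample.cpp. The names KIND_LABELS, collect_names and parse_and_collect are my own. I also assume the loop and mean would have to be moved inside a function, with sum, array and linecount declared, for the file to parse cleanly.

```python
# Cursor kinds of interest, keyed by CursorKind name, mapped to a label.
KIND_LABELS = {
    "FUNCTION_DECL": "function name",
    "PARM_DECL": "variable name (parameter)",
    "VAR_DECL": "variable name",
}


def collect_names(cursor, found=None):
    """Walk the cursor tree depth-first, recording (label, name, line, column)."""
    if found is None:
        found = []
    label = KIND_LABELS.get(cursor.kind.name)
    if label and cursor.spelling:
        loc = cursor.location
        found.append((label, cursor.spelling, loc.line, loc.column))
    for child in cursor.get_children():
        collect_names(child, found)
    return found


def parse_and_collect(path):
    """Parse a C++ file with libclang and return the declarations found in it."""
    # Imported here so collect_names itself has no dependency on libclang.
    from clang.cindex import Index

    index = Index.create()
    # "-std=c++11" is an assumed flag; adjust it to match the real project.
    translation_unit = index.parse(path, args=["-std=c++11"])
    return collect_names(translation_unit.cursor)
```

Calling parse_and_collect("sample.cpp") should then return one tuple per declaration, which could be printed or fed into a later tokenizing step. Declarations coming from included headers would also be visited, so filtering on cursor.location.file may be needed.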
However, given that I am new to Python and struggling to understand the basics of libclang, I would very much appreciate an example chunk of code that implements the above, for me to pick up and learn from.