
I have an instance of CXCursor of kind CXCursor_CXXMethod. I want to find out if the function is const or volatile, for example:

class Foo {
public:
    void bar() const;
    void baz() volatile;
    void qux() const volatile;
};

I could not find anything useful in the documentation of libclang. I tried clang_isConstQualifiedType and clang_isVolatileQualifiedType but these always seem to return 0 on C++ member function types.

user1071136

2 Answers


I can think of two approaches:

Using the libclang lexer

The code which appears in this SO answer works for me; it uses the libclang tokenizer to break a method declaration apart, and then records any keywords outside of the method parentheses.

It does not access the AST of the code, and as far as I can tell doesn't involve the parser at all. If you are sure the code you investigate is proper C++, I believe this approach is safe.

Disadvantages: This solution does not appear to take into account preprocessing directives, so the code has to be processed first (e.g., passed through cpp).

Example code (the file to parse must be the first argument to your program, e.g. ./a.out bla.cpp):

#include "clang-c/Index.h"
#include <string>
#include <set>
#include <iostream>

std::string GetClangString(CXString str)
{
  // Copy the text out, then release the CXString on every path,
  // including the one where clang_getCString returns NULL.
  const char* tmp = clang_getCString(str);
  std::string translated = (tmp == NULL) ? "" : tmp;
  clang_disposeString(str);
  return translated;
}

void GetMethodQualifiers(CXTranslationUnit translationUnit,
                         std::set<std::string>& qualifiers,
                         CXCursor cursor) {
  qualifiers.clear();

  CXSourceRange range = clang_getCursorExtent(cursor);
  CXToken* tokens;
  unsigned int numTokens;
  clang_tokenize(translationUnit, range, &tokens, &numTokens);

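  // Track nesting depth rather than a single flag, so that parameters which
  // themselves contain parentheses (e.g. function pointers) are skipped whole.
  int parenDepth = 0;
  for (unsigned int i = 0; i < numTokens; i++) {
    std::string token = GetClangString(clang_getTokenSpelling(translationUnit, tokens[i]));
    if (token == "(") {
      parenDepth++;
    } else if (parenDepth == 0 && (token == "{" || token == ";")) {
      // Start of the body or end of the declaration: nothing more to scan.
      break;
    } else if (token == ")") {
      parenDepth--;
    } else if (clang_getTokenKind(tokens[i]) == CXToken_Keyword &&
             parenDepth == 0) {
      // Any keyword outside the parameter list is recorded.
      qualifiers.insert(token);
    }
  }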

  clang_disposeTokens(translationUnit, tokens, numTokens);
}

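int main(int argc, char *argv[]) {
  if (argc < 2) {
    std::cerr << "usage: " << argv[0] << " <file.cpp>" << std::endl;
    return 1;
  }
  CXIndex Index = clang_createIndex(0, 0);
  CXTranslationUnit TU = clang_parseTranslationUnit(Index, 0,
          argv, argc, 0, 0, CXTranslationUnit_None);
  if (TU == NULL) {
    std::cerr << "failed to parse " << argv[1] << std::endl;
    clang_disposeIndex(Index);
    return 1;
  }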

  // Set the file you're interested in, and the code location:
  CXFile file = clang_getFile(TU, argv[1]);
  int line = 5;
  int column = 6;
  CXSourceLocation location = clang_getLocation(TU, file, line, column);
  CXCursor cursor = clang_getCursor(TU, location);

  std::set<std::string> qualifiers;
  GetMethodQualifiers(TU, qualifiers, cursor);

  for (std::set<std::string>::const_iterator i = qualifiers.begin(); i != qualifiers.end(); ++i) {
    std::cout << *i << std::endl;
  }

  clang_disposeTranslationUnit(TU);
  clang_disposeIndex(Index);
  return 0;
}
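
The scanning rule can also be tried out without libclang. The following is a minimal sketch, not the code from the linked answer: `trailingQualifiers` is a hypothetical helper that takes token spellings as plain strings (in the program above they would come from clang_getTokenSpelling) and, unlike the original loop, keeps only `const` and `volatile` found after the parameter list has closed.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of libclang): collects the cv-qualifiers
// that follow the parameter list of an already-tokenized declaration.
std::set<std::string> trailingQualifiers(const std::vector<std::string>& tokens) {
  std::set<std::string> found;
  int depth = 0;            // nesting level of parentheses
  bool seenParams = false;  // true once the outermost ')' has been seen
  for (const std::string& tok : tokens) {
    if (tok == "(") {
      ++depth;
    } else if (tok == ")") {
      if (--depth == 0) seenParams = true;
    } else if (depth == 0 && (tok == "{" || tok == ";")) {
      break;  // body or end of declaration reached
    } else if (depth == 0 && seenParams &&
               (tok == "const" || tok == "volatile")) {
      found.insert(tok);
    }
  }
  return found;
}
```

For the tokens of `void qux() const volatile;` this yields both qualifiers, while a `const` in the return type or inside the parameter list is ignored. Declarators with more than one top-level pair of parentheses (such as functions returning function pointers) are not handled by this sketch.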

Using libclang's Unified Symbol Resolution (USR)

This approach involves using the parser itself, and extracting qualifier information from the AST.

Advantages: Seems to work for code with preprocessor directives, at least for simple cases.

Disadvantages: My solution parses the USR, which is undocumented, and might change in the future. Still, it's easy to write a unit-test to guard against that.

Take a look at $(CLANG_SRC)/tools/libclang/CIndexUSRs.cpp: it contains the code that generates a USR, and therefore the information required to parse the USR string. See lines 523-529 (in LLVM 3.1's source, downloaded from www.llvm.org) for the qualifier part.

Add the following function somewhere:

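void parseUsrString(const std::string& usrString, bool* isVolatile, bool* isConst, bool *isRestrict) {
  *isVolatile = *isConst = *isRestrict = false;

  // The qualifier character follows the last '#'; earlier '#' characters
  // can appear when parameter or return types are encoded in the USR.
  size_t hashLocation = usrString.rfind('#');
  if (hashLocation == std::string::npos || hashLocation == usrString.length() - 1) {
    return;
  }
  int x = usrString[hashLocation + 1];

  // The qualifiers are encoded as the character '0' + bitmask; any other
  // character is treated as "no qualifiers".
  if (x < '0' || x > '7') {
    return;
  }

  *isConst = x & 0x1;
  *isVolatile = x & 0x4;
  *isRestrict = x & 0x2;
}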

and in main(),

CXString usr = clang_getCursorUSR(cursor);
const char *usr_string = clang_getCString(usr);
std::cout << usr_string << "\n";
bool isVolatile, isConst, isRestrict;
parseUsrString(usr_string, &isVolatile, &isConst, &isRestrict);
std::cout << "restrict, volatile, const: " << isRestrict << " " << isVolatile << " " << isConst << "\n";
clang_disposeString(usr);

Running on Foo::qux() from

#define BLA const

class Foo {
public:
    void bar() const;
    void baz() volatile;
    void qux() BLA volatile;
};

produces the expected result of

c:@C@Foo@F@qux#5
restrict, volatile, const: 0 1 1

Caveat: you might have noticed that libclang's source suggests my code should use isVolatile = x & 0x2 rather than 0x4, so you may need to replace 0x4 with 0x2. It's possible that my implementation (OS X) has the two swapped.
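
The unit test mentioned among the disadvantages can be sketched without libclang by decoding USR strings directly. This is only a sketch under stated assumptions: `MethodQualifiers` and `decodeUsrQualifiers` are hypothetical names, the bit layout (const = 0x1, restrict = 0x2, volatile = 0x4) is the one that matches the sample output above, and the decoder reads the character after the last `#`, as a commenter points out below. The USR format is undocumented, so the expected strings should be regenerated with your own libclang version.

```cpp
#include <string>

// Hypothetical result type for this sketch.
struct MethodQualifiers {
  bool isConst = false;
  bool isVolatile = false;
  bool isRestrict = false;
};

// Decodes the qualifier character that follows the last '#' of a USR.
// Assumes the encoding '0' + bitmask seen in the sample output above.
MethodQualifiers decodeUsrQualifiers(const std::string& usr) {
  MethodQualifiers q;
  const std::string::size_type hash = usr.rfind('#');
  if (hash == std::string::npos || hash + 1 >= usr.size()) {
    return q;  // no '#' at all, or nothing after it: unqualified
  }
  const char c = usr[hash + 1];
  if (c < '0' || c > '7') {
    return q;  // not a qualifier bitmask
  }
  const int bits = c - '0';
  q.isConst = (bits & 0x1) != 0;
  q.isRestrict = (bits & 0x2) != 0;
  q.isVolatile = (bits & 0x4) != 0;
  return q;
}
```

With the USR printed above, `decodeUsrQualifiers("c:@C@Foo@F@qux#5")` reports const and volatile but not restrict. A test suite would pin a handful of such strings and fail loudly if a libclang upgrade changes the format.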

user1071136
  • Does this work when the `const` or `volatile` qualifiers occur only after preprocessing? –  Aug 26 '12 at 14:57
  • No; libclang lexes the file you provide as is. You can solve this by running your file through a preprocessor before handing it to this code, i.e. `cpp bla.cpp > bla.cpp.pre`. – user1071136 Aug 26 '12 at 15:03
  • Added another solution which seems to work with preprocessing. – user1071136 Aug 26 '12 at 17:35
  • This was a very helpful comment - I couldn't figure out how to read `const` qualifiers in `clang.cindex`. However, it contains a subtle bug: you need to apply this logic to the byte after the *last*, not the *first*, `#` in the USR data. There could be multiple `#` if the function parameters or return value are `const`-qualified. – Michael Koval Sep 04 '15 at 03:51
  • Also, in case anyone else stumbles upon this, this logic is now in `lib/Index/USRGeneration.cpp` in LLVM 3.6.2. – Michael Koval Sep 04 '15 at 03:53

You can detect const or pure virtual member functions with clang_getCursorPrettyPrinted(). It returns the full prototype of the function or method (virtual, const, = 0, and so on, as written in the source). All you need is a cursor for the function or method in question; the earlier answer shows how to get one.

The code below is written in C++, but because libclang is a C API it translates to C easily.

Example of how to check for a const method (getAsStdString is defined below):

auto funcPrettyPrinted = getAsStdString(
        clang_getCursorPrettyPrinted(cursor, nullptr));
if (std::string::npos != funcPrettyPrinted.find(") const"))
{
    break;
}

Example of how to check for a pure virtual function:

auto funcPrettyPrinted = getAsStdString(
        clang_getCursorPrettyPrinted(cursor, nullptr));
if (std::string::npos != funcPrettyPrinted.find("virtual") &&
    std::string::npos != funcPrettyPrinted.find("= 0"))
{
    break;
}

Here is an example of pretty printed output:

virtual void one() = 0

The helper function used above:

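std::string getAsStdString(CXString str)
{
    // Copy the text out, then dispose of the CXString on every path,
    // including the one where clang_getCString returns nullptr.
    auto cstr = clang_getCString(str);
    std::string stdStr{cstr ? cstr : ""};
    clang_disposeString(str);
    return stdStr;
}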

If a plain substring search is not robust enough for your input, you can use a regex instead.
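
A regex-based check could look like the following sketch. `hasTrailingQualifier` is a hypothetical helper, not a libclang function, and it assumes the pretty-printed prototype places the cv-qualifiers directly after the closing parenthesis of the parameter list, as in the `) const` search above. The `qualifier` argument is expected to be either `const` or `volatile`.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `qualifier` ("const" or "volatile") appears
// in the run of cv-qualifiers that directly follows a ')' in the prototype.
bool hasTrailingQualifier(const std::string& prototype,
                          const std::string& qualifier) {
  // ')' + optional whitespace + any number of other cv-qualifiers + the
  // requested one; \b keeps "const" from matching a longer identifier.
  const std::regex pattern(
      "\\)\\s*(?:(?:const|volatile)\\s+)*" + qualifier + "\\b");
  return std::regex_search(prototype, pattern);
}
```

Unlike the substring search, this also finds `const` in `void qux() volatile const`. It can still report a false positive when a parameter is itself a pointer to a const member function, so treat it as a heuristic rather than a parser.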