
The ITranslationUnit and IASTTranslationUnit interfaces represent the translation unit and AST of a single C/C++ source file, respectively.

Is there any way to get the AST of an entire C++ project or do I need to start from the AST of the main file and navigate through the include directives and produce a separate AST for each source unit?

Thanks.

STiGMa

1 Answer


CDT's AST is not designed to scale to an entire project. Once the code covered by a single AST gets into the 10,000+ LOC range, performance is likely to degrade pretty badly.

For cross-file analysis purposes, CDT has an indexer, which parses each file in the project (one at a time), and builds a database of information about the code in the project as a whole (called the index). The index is accessed via the interface IIndex, an instance of which can be obtained (for example) by calling IASTTranslationUnit.getIndex() on any AST.

Most code analysis and manipulation use cases fall into one of the following workflows:

  • Just use the index. IIndex gives you a lot to work with, such as:

    • various overloads of findBindings() to find bindings matching a name or name prefix
    • findReferences(binding) to give you all references to a binding
    • findDeclarations(binding) to give you all declarations of a binding


    and many others. This is how editor navigation features like Open Declaration and Call Hierarchy work.

  • Use the index to identify a small set of source files for which you need ASTs, and then parse those. This is how refactorings work. For example, the rename refactoring uses the index to locate uses of the binding being renamed, and then creates ASTs for the files containing those uses to perform the refactoring on.

  • If neither of the above is good enough and you really need AST-level information for every file in the project, create an AST for every file in the project, one at a time, and extract the information you need from each one. This is how the indexer itself works. (Note that, if you choose this option, you don't need to navigate includes to list all the files you need to parse. Instead, you can just enumerate all the files in the project. See PDOMRebuildTask.createDelegate() for an example.)
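
To make the second workflow concrete, here is a minimal, self-contained sketch of the "index first, parse only what you need" idea. It deliberately does not use the CDT API: `ToyIndex`, `IndexThenParseDemo`, and `filesToParse` are hypothetical names invented for illustration, and plain token matching stands in for real binding resolution. In actual CDT code, the role of `ToyIndex` would be played by `IIndex` (e.g. `findBindings()` and `findReferences(binding)`), and the files it returns are the ones you would then create ASTs for.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy stand-in for a project index (NOT CDT API): maps an identifier to the files that mention it. */
class ToyIndex {
    private final Map<String, Set<String>> filesByName = new LinkedHashMap<>();

    /** "Indexes" a single file by recording which identifier-like tokens occur in it. */
    void indexFile(String path, String source) {
        for (String token : source.split("[^A-Za-z0-9_]+")) {
            // Skip empty fragments and numeric literals.
            if (!token.isEmpty() && !Character.isDigit(token.charAt(0))) {
                filesByName.computeIfAbsent(token, k -> new TreeSet<>()).add(path);
            }
        }
    }

    /** Returns the files that mention the given name, in sorted order. */
    Set<String> filesReferencing(String name) {
        return filesByName.getOrDefault(name, new TreeSet<>());
    }
}

class IndexThenParseDemo {
    /** Indexes every file one at a time, then returns only the files worth a full parse. */
    static Set<String> filesToParse(Map<String, String> project, String name) {
        ToyIndex index = new ToyIndex();
        project.forEach(index::indexFile);
        return index.filesReferencing(name);
    }

    public static void main(String[] args) {
        // Hypothetical project contents, keyed by file name.
        Map<String, String> project = new LinkedHashMap<>();
        project.put("util.h", "int area(int w, int h);");
        project.put("util.cpp", "int area(int w, int h) { return w * h; }");
        project.put("main.cpp", "int main() { return area(2, 3); }");
        project.put("log.cpp", "void log(const char* msg) {}");

        // Only the files printed here would need an AST for, say, renaming 'area'.
        System.out.println(filesToParse(project, "area"));
    }
}
```

The point of the sketch is the shape of the workflow: the cheap project-wide lookup narrows the set of files, so the expensive per-file parse is only done where it matters (here, `log.cpp` is never selected).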

If you say more about what your use case is, I may be able to provide more specific suggestions.

HighCommander4
  • I think you are saying CDT naturally parses a single source file but not its include files. So how does CDT parse a "most vexing parse" (or any other syntactically ambiguous construct) if the necessary declaration information is in an include file? See http://stackoverflow.com/questions/17388771/get-human-readable-ast-from-c-code/17393852#17393852 – Ira Baxter Dec 14 '16 at 08:26
  • @IraBaxter: The indexer parses files in dependency order, and looks up information from previously parsed files in the index. – HighCommander4 Dec 14 '16 at 16:09
  • Under some assumption about constancy of state of preprocessor conditional variables? (One can imagine one CU doing #define FLAG TRUE and another CU doing #define FLAG FALSE, and then both of them including a header that checks that flag) – Ira Baxter Dec 14 '16 at 17:21
  • @IraBaxter: Certain macros are identified as being "significant", and, where appropriate, header files are indexed once for every combination of significant macros. See https://bugs.eclipse.org/bugs/show_bug.cgi?id=197989 for details. The system isn't perfect - you can still confuse it if you try hard enough - but it works well enough to handle things like the Boost libraries in practice. – HighCommander4 Dec 14 '16 at 22:09
  • Where are the links to documentation for any of this? I'm trying to figure out how to parse C++ into an AST and traverse it, but all my searching turns up is posts like this one, with no link to documentation or hint at how to get a foot in the door. – searchengine27 Aug 07 '20 at 17:54
  • @searchengine27 I'm not going to pretend that this stuff is particularly well-documented, but the public APIs at least do have Javadocs hosted [here](https://help.eclipse.org/2020-06/index.jsp?topic=%2Forg.eclipse.cdt.doc.isv%2Freference%2Fapi%2Foverview-summary.html) (e.g. [`IIndex`](https://help.eclipse.org/2020-06/index.jsp?topic=%2Forg.eclipse.cdt.doc.isv%2Freference%2Fapi%2Forg%2Feclipse%2Fcdt%2Fcore%2Findex%2FIIndex.html)). – HighCommander4 Aug 07 '20 at 21:07
  • `GCCLanguage.getDefault().getASTTranslationUnit(FileContent.createForExternalFileLocation(fileName), new ScannerInfo(definedMacros, includePaths), IncludeFileContentProvider.getEmptyFilesProvider(), null, ILanguage.OPTION_PARSE_INACTIVE_CODE, new DefaultLogService()).getIndex();` returns null. Any idea what the problem is? CDT version: 11.0 – Veno Mar 09 '23 at 06:59