You can use an existing C parser written in Java. It does a lot more than parsing header files, of course, but that shouldn't hurt you.
We use the parser from the Eclipse CDT project. This is an Eclipse plugin, but we successfully use it outside of Eclipse; we just have to bundle 3 Eclipse JAR files with the parser JAR.
To use the CDT parser, start with an implementation of org.eclipse.cdt.core.model.ILanguage, for example org.eclipse.cdt.core.dom.ast.gnu.c.GCCLanguage. You can call getASTTranslationUnit on it, passing the code and some helper objects (scanner info, an include-file provider, an optional index, parser options and a log service). A code file is represented by an org.eclipse.cdt.core.parser.FileContent instance (at least in CDT 7; this API seems to change a lot). The easiest way to create such an object is FileContent.createForExternalFileLocation(filename) or FileContent.create(filename, content). This way you don't need to care about the Eclipse IFile stuff, which seems to work only within projects and workspaces.
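The parsing step could look roughly like the sketch below. It is not taken from our code base: the class name ParseHeader, the file name example.h and the empty scanner info are made up for illustration, and it assumes the CDT 7-style API in which the method on ILanguage is called getASTTranslationUnit and takes six arguments. Other CDT versions may differ, and the JARs listed in this answer must be on the classpath.

```java
import org.eclipse.cdt.core.dom.ast.IASTTranslationUnit;
import org.eclipse.cdt.core.dom.ast.gnu.c.GCCLanguage;
import org.eclipse.cdt.core.model.ILanguage;
import org.eclipse.cdt.core.parser.DefaultLogService;
import org.eclipse.cdt.core.parser.FileContent;
import org.eclipse.cdt.core.parser.IParserLogService;
import org.eclipse.cdt.core.parser.IScannerInfo;
import org.eclipse.cdt.core.parser.IncludeFileContentProvider;
import org.eclipse.cdt.core.parser.ScannerInfo;
import org.eclipse.core.runtime.CoreException;

// Hypothetical helper class, for illustration only.
public class ParseHeader {

    // Parses C source held in memory and returns the AST root.
    static IASTTranslationUnit parse(String fileName, String code) throws CoreException {
        ILanguage language = GCCLanguage.getDefault();

        // In-memory content; the file name is only used as a label here.
        FileContent content = FileContent.create(fileName, code.toCharArray());

        // Assumption: no predefined macros and no include paths.
        IScannerInfo scannerInfo = new ScannerInfo();

        // Assumption: treat every #include as an empty file instead of resolving it.
        IncludeFileContentProvider includes = IncludeFileContentProvider.getEmptyFilesProvider();

        IParserLogService log = new DefaultLogService();
        int options = 0; // no special parser options

        // No index is passed (null), since we run outside of a workspace.
        return language.getASTTranslationUnit(content, scannerInfo, includes, null, options, log);
    }

    public static void main(String[] args) throws CoreException {
        IASTTranslationUnit tu = parse("example.h", "int add(int a, int b);\n");
        System.out.println(tu.getDeclarations().length + " top-level declaration(s)");
    }
}
```

To read from disk instead, FileContent.createForExternalFileLocation(filename) can replace the FileContent.create call. If the headers you parse depend on macros or include paths, you would probably want to pass a populated ScannerInfo and a real include provider rather than the empty ones used here.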
The IASTTranslationUnit you get back represents the whole AST of the file. All the nodes therein are instances of IASTSomething types, for example IASTDeclaration etc. You can implement your own subclass of org.eclipse.cdt.core.dom.ast.ASTVisitor to iterate through the AST using the visitor pattern. If you need further help, just ask.
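A visitor might be sketched like this. The class name DeclarationCollector and the collect helper are hypothetical, not part of CDT. The sketch assumes that ASTVisitor only calls visit(IASTDeclaration) when the shouldVisitDeclarations flag is set, which is how the CDT versions I have seen behave.

```java
import java.util.ArrayList;
import java.util.List;

import org.eclipse.cdt.core.dom.ast.ASTVisitor;
import org.eclipse.cdt.core.dom.ast.IASTDeclaration;
import org.eclipse.cdt.core.dom.ast.IASTTranslationUnit;

// Hypothetical visitor that records the source text of every declaration.
public class DeclarationCollector extends ASTVisitor {

    private final List<String> signatures = new ArrayList<>();

    public DeclarationCollector() {
        // Visiting is opt-in per node kind; enable it for declarations only.
        shouldVisitDeclarations = true;
    }

    @Override
    public int visit(IASTDeclaration declaration) {
        // getRawSignature() returns the declaration as written in the source.
        signatures.add(declaration.getRawSignature());
        // Keep descending, so nested declarations are visited as well.
        return PROCESS_CONTINUE;
    }

    // Convenience helper: run the visitor over a whole translation unit.
    static List<String> collect(IASTTranslationUnit tu) {
        DeclarationCollector collector = new DeclarationCollector();
        tu.accept(collector);
        return collector.signatures;
    }
}
```

Returning PROCESS_SKIP instead of PROCESS_CONTINUE would skip the children of the current node, which can be useful if you only care about top-level declarations in a header.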
The JAR files we use are org.eclipse.cdt.core.jar, org.eclipse.core.resources.jar, org.eclipse.equinox.common.jar, and org.eclipse.osgi.jar.
Edit: I found a paper which contains source code snippets for this: "Using the Eclipse C/C++ Development Tooling as a Robust, Fully Functional, Actively Maintained, Open Source C++ Parser", but it is no longer available online (only as a shortened version).