I have been trying to obtain ASTs from Clang, but I have not been successful so far. I found a one-year-old question here on Stack Overflow that mentions two other ways to obtain the AST using Clang:

./llvmc -cc1 -ast-dump file.c

./llvmc -cc1 -ast-print file.c

That question also mentions Doxygen and a representation in which an AST is given, but I am mostly looking for one in some textual form, such as XML, so that further analysis can be performed.

Lastly, there was another question here on Stack Overflow about exactly this XML import, but that feature was discontinued for several reasons that are also mentioned there.

My question, then, is: which version of Clang should I use, and how can I use it from the console to obtain AST-related information for a given piece of C code? I believe this should be a painless one-line command like those above, but as far as I have read, the documentation index does not mention ASTs at all, and the only thing I found for llvmc was about writing an AST by hand, which is not really what I am looking for.

I tried all of the commands above, but they all fail on version 2.9, and I have already found out that LLVM changes a great deal between versions.

Thank you.

  • Doxygen AFAIK doesn't produce anything resembling an AST. – Ira Baxter Jun 08 '12 at 20:27
  • That other answer you mention says the XML export format was removed from Clang. IMHO, this is a good thing, as a real XML tree for a C program of any size will be simply monstrous, and XML is pretty awkward to manipulate; you want to use the facilities built into the tool instead to access the AST. – Ira Baxter Jun 08 '12 at 20:28
  • ... is Clang the only tool you will consider? – Ira Baxter Jun 08 '12 at 20:29
  • Can you provide some detail as to what you want to do with the AST? Analyze for some problem? Transform? Extract? ...? – Ira Baxter Jun 08 '12 at 21:03
  • I misinterpreted what you wrote, sorry. I took it to mean that Doxygen-the-tool-itself produces ASTs; I think you meant the Doxygen-output-from-Clang contains descriptions of ASTs. – Ira Baxter Jun 08 '12 at 22:06
  • ... "Need it to be textually written so I can perform operations on it". You can't do much with a textual representation of a tree with sed or perl. You can arguably do more if your "text representation" can really be processed as a tree, e.g., XML or Lisp notation. You need lots more than just a tree, though. You've had a compiler class? – Ira Baxter Jun 08 '12 at 22:09
  • "For validation purposes"? I'm not sure what you are asking. If you are asking whether I recommend DMS for many program analysis/transformation purposes, the answer is yes. But it's my concept and product, and many people distrust advice from the purveyor of a thing. Fair enough. I try to tell people DMS has useful structure, but expect them to form their own judgement. I do that believing they will decide that what I am telling them is accurate, but I'd prefer they arrive at the same conclusion by themselves after examining the facts. – Ira Baxter Jun 09 '12 at 02:01

1 Answer

OP says "open to other suggestions as well".

He might consider our DMS Software Reengineering Toolkit with its C Front End.

While it would be pretty easy to exhibit a C AST produced by this, it is easier to show an ObjectiveC AST [already at SO] produced by DMS using the same C front end (ObjectiveC is a dialect of C). See https://stackoverflow.com/a/10749970/120163 for that example. DMS can produce an XML equivalent of this, too.

We don't recommend exporting trees as text, because in our experience real trees for real code are simply enormous and are poorly manipulated as text or XML objects. One usually needs machinery beyond parsing. See my discussion about Life After Parsing.
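To make the size concern concrete, here is a minimal sketch that uses Clang rather than DMS. It assumes a `clang` binary on the PATH that is recent enough to accept `-Xclang -ast-dump=json` (much newer than the 2.9 release mentioned in the question). The helper names `build_command`, `count_nodes`, and `dump_ast` are hypothetical, chosen only for this illustration. Instead of treating the dump as text, it loads the tree as structured data and counts nodes by kind.

```python
import json
import shutil
import subprocess


def build_command(source_path, clang="clang"):
    """Build the command line that asks Clang for a JSON AST dump."""
    # -fsyntax-only: parse and type-check only; no object file is produced.
    # -Xclang -ast-dump=json: hand the dump option to the compiler frontend.
    return [clang, "-fsyntax-only", "-Xclang", "-ast-dump=json", source_path]


def count_nodes(node):
    """Count AST nodes by kind, following each node's "inner" child list."""
    counts = {}
    stack = [node]
    while stack:
        current = stack.pop()
        # Nodes without a "kind" (e.g. empty placeholders) are grouped together.
        kind = current.get("kind", "<unknown>")
        counts[kind] = counts.get(kind, 0) + 1
        stack.extend(current.get("inner", []))
    return counts


def dump_ast(source_path, clang="clang"):
    """Run Clang and return the parsed JSON tree, or None if Clang is missing."""
    if shutil.which(clang) is None:
        return None
    # check=True raises CalledProcessError if Clang reports errors in the source.
    result = subprocess.run(
        build_command(source_path, clang),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Calling `count_nodes(dump_ast("file.c"))` on a file that includes even one standard header typically reports far more nodes than the file has lines, because every declaration pulled in by the header is part of the tree. That is the practical reason for walking the tree with a tool's own facilities instead of text-processing a dump.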

Ira Baxter
    GCC, Clang, Frama-C, and CIL are the FOSS solutions I know about. Each of these addresses the "Life After Parsing" issues to some extent. I built DMS because I believe these solutions are not enough; notice that all these tools are C or C++ specific, which does not address the problem of handling large, complex software systems. Whether these are adequate for your problem will require judgement on your part. – Ira Baxter Jun 09 '12 at 02:00