
I am working on a reverse engineering school project, which requires me to translate a compiled C# project into an AST and manipulate it. I have seen the post "Translate C# code into AST?" on this website, but it doesn't look like what I am looking for.

As far as I know, C# currently doesn't provide a library class like this one for Java: http://help.eclipse.org/help33/index.jsp?topic=/org.eclipse.cdt.doc.isv/reference/api/org/eclipse/cdt/core/dom/ast/ASTVisitor.html. If there is such a library class for C#, everything here is solved.

I have consulted someone, and here are the possible solutions. But I have problems working out these solutions as well:

  1. Find another compiler that provides a library which exposes its AST for manipulation. But I can't find a compiler like that.
  2. Use the ANTLR parser generator to build my own compiler that does this (a much longer and more difficult process). The download provides sample grammars for different languages, but not for C# (it has grammars written in various languages, including C#, but none that describes the C# language itself). Hence the problem: I can't find a C# grammar.

What is the shortest and fastest way to approach this issue? If I really have to take one of the alternatives above, how should I go about solving the problems I faced?

halfer
yeeen
  • I'm unclear on whether you want to manipulate C# ASTs using C#, or just manipulate ASTs using any tools at all. And I'm confused by your remark about ANTLR: I thought there was a C# 3.0 grammar that ANTLR could process to parse and build C# trees. – Ira Baxter Aug 21 '09 at 01:27

5 Answers


I know the answer for this one was accepted long ago. But I had a similar question and wasn't sure of the options out there. I did a little investigation of the NRefactory library that ships as part of SharpDevelop. It does generate an AST from C# code.

Here's an image of the NRefactory demo application that is part of the SD source code. Type in some C# code and it generates and displays the AST in a treeview.

[Image: NRefactory demo application displaying the AST in a treeview]

Cheeso

Why don't you try NRefactory? I've seen it discussed for AST work on some SharpDevelop forums.

Here is an article on CodeProject regarding this topic.

TheVillageIdiot

A full C# 3.0 parser is available with our DMS Software Reengineering Toolkit (DMS for short). It has been used to process tens of thousands of C# files accurately. It provides automated AST building, tree traversals, surface-syntax pattern matching and transformation and lots more. As a commercial product it might not work out for a student project.

ANTLR arguably offers a C# parser, but I don't know how complete or robust it is, or whether it actually builds ASTs.

[EDIT Jan 25 2010: C# 4.0 parser now available for DMS with all the above properties]

[EDIT May 2016: C# 6.0 parser available for DMS.]

Ira Baxter
  • To clarify: I want something that lets me parse a C# file, or a directory containing a C# project, and build an AST that I can manipulate. The manipulation is for reverse engineering C# code into UML. – yeeen Aug 21 '09 at 02:32
  • UML has a variety of submodels. Producing class diagrams from a parse tree with a symbol table should be pretty easy. Producing state chart diagrams is likely pretty hard. We have some experience using DMS to process Java source code into OMG Executable UML, and that was a bit tricky; it needed full control and data flow analysis. None of the latter is available with ANTLR or any general parsing system I know of, except for DMS. YMMV. – Ira Baxter Aug 21 '09 at 02:42
  • DMS - Document management system? YMMV - ??? Anyway, I am only required to reverse engineer into a UML class diagram. I am now working with ANTLR, as it seems to be the only way I can manipulate the AST. Later on I will be required to use the Class Hierarchy Analysis approach to convert to UML, which is supposed to be stored in a database. I need to wait for further instructions on the UML part. I posted another question here: http://stackoverflow.com/questions/1291153/building-own-c-compiler-using-antlr-compilation-unit. Still working on it. Maybe you can give some guidance. – yeeen Aug 21 '09 at 08:00
  • DMS == "DMS Software Reengineering Toolkit", see my answer above. YMMV == "Your mileage may vary". To get the class diagram, you will need a full parser, and I think you'll need a symbol table (or at least you'll have to hack something that records what entities are classes and hope that no two of them have the same name). – Ira Baxter Aug 21 '09 at 11:37
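The idea in the comments above (walk a parse tree, record which entities are classes in a symbol table, then read the class diagram off that table) can be sketched roughly as follows. This is only an illustration of the technique, not the API of NRefactory, ANTLR, DMS or Mono Cecil: it uses Python's standard `ast` module on Python source because that parser ships with the language, and the names `ClassCollector`, `build_class_table`, `subclasses_of` and `SOURCE` are made up for the example. With a C# AST the same walk would run over whatever class-declaration node type the chosen parser provides.

```python
import ast
from collections import defaultdict

# Hypothetical input; in the real project this would be the parsed C# code.
SOURCE = '''
class Shape:
    def area(self): ...

class Circle(Shape):
    def area(self): ...
    def radius(self): ...

class Square(Shape):
    def side(self): ...
'''


class ClassCollector(ast.NodeVisitor):
    """Visitor that records every class declaration it encounters."""

    def __init__(self):
        self.symbols = {}     # class name -> {"bases": [...], "methods": [...]}
        self.duplicates = []  # names declared more than once (ambiguous)

    def visit_ClassDef(self, node):
        if node.name in self.symbols:
            # Without scopes or namespaces, two classes with the same
            # name cannot be told apart; just note the clash.
            self.duplicates.append(node.name)
        self.symbols[node.name] = {
            # Only simple base names are kept (no dotted or generic bases).
            "bases": [b.id for b in node.bases if isinstance(b, ast.Name)],
            "methods": [n.name for n in node.body
                        if isinstance(n, ast.FunctionDef)],
        }
        self.generic_visit(node)  # continue into nested classes


def build_class_table(source):
    """Parse the source text and return the filled-in collector."""
    collector = ClassCollector()
    collector.visit(ast.parse(source))
    return collector


def subclasses_of(symbols):
    """Invert the base-class links: base name -> list of direct subclasses."""
    children = defaultdict(list)
    for name, info in symbols.items():
        for base in info["bases"]:
            children[base].append(name)
    return dict(children)


collector = build_class_table(SOURCE)
for name, info in collector.symbols.items():
    print(name, "extends", info["bases"] or ["(nothing)"],
          "methods:", info["methods"])
```

The `symbols` table holds what a class box needs (name and methods), and `subclasses_of` gives the inheritance arrows. The `duplicates` list is the crude "hope no two of them have the same name" check; a real symbol table would track namespaces and scopes instead.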

ANTLR is not a good choice. I am now trying Mono Cecil instead. Mono Cecil is good for analyzing any code that can be compiled into Common Intermediate Language (CIL), since it works on the compiled assembly rather than on the source text. The disadvantage is that it doesn't have proper documentation.

yeeen

I've just posted an answer on another thread here on Stack Overflow, describing a solution in which I implemented an API to create and manipulate an AST from C# source code.

Dinis Cruz