18

Is it currently possible to translate C# code into an Abstract Syntax Tree?

Edit: some clarification; I don't necessarily expect the compiler to generate the AST for me - a parser would be fine, although I'd like to use something "official." Lambda expressions are unfortunately not going to be sufficient given they don't allow me to use statement bodies, which is what I'm looking for.

Erik Forbes

12 Answers

20

The Roslyn project, first released as a preview (CTP) for Visual Studio 2010, gives you programmatic access to the syntax tree, among other things.

// Roslyn CTP-era API; requires 'using Roslyn.Compilers.CSharp;'
SyntaxTree tree = SyntaxTree.ParseCompilationUnit(
    @" C# code here ");
var root = (CompilationUnitSyntax)tree.Root;
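The type and method names changed between the preview and the released Roslyn packages. A minimal sketch against the released API, assuming the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the `Greeter` source and the helper name `DescribeStatements` are made up for illustration. It lists the kind of each statement inside every method body, which is the statement-level detail that lambda expression trees do not expose:

```csharp
// Assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced.
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class SyntaxTreeDemo
{
    // Hypothetical helper: returns "<method name>: <statement kind>" per statement.
    public static List<string> DescribeStatements(string source)
    {
        var tree = CSharpSyntaxTree.ParseText(source);
        var root = (CompilationUnitSyntax)tree.GetRoot();

        var lines = new List<string>();
        foreach (var method in root.DescendantNodes().OfType<MethodDeclarationSyntax>())
        {
            // Abstract and expression-bodied methods have no block body.
            if (method.Body == null) continue;

            foreach (var statement in method.Body.Statements)
                lines.Add(method.Identifier.Text + ": " + statement.Kind());
        }
        return lines;
    }

    public static void Main()
    {
        // Hypothetical input; any C# source text can be parsed the same way.
        const string source = @"
class Greeter
{
    void Greet(string name)
    {
        var message = ""Hello, "" + name;
        System.Console.WriteLine(message);
    }
}";
        foreach (var line in DescribeStatements(source))
            Console.WriteLine(line);
    }
}
```

Parsing alone needs no compilation or references; semantic information (types, symbol binding) would require building a compilation on top of the tree.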
Mustafa Özçetin
Paul Rubel
12

Is it currently possible to translate C# code into an Abstract Syntax Tree?

Yes, trivially in special circumstances (= using the new Expressions framework):

// Requires 'using System.Linq.Expressions;'
Expression<Func<int, int>> f = x => x * 2;

This creates an expression tree for the lambda, i.e. a function taking an int and returning its double. You can modify the expression tree by using the Expressions framework (= the classes in that namespace) and then compile it at run-time:

var newBody = Expression.Add(f.Body, Expression.Constant(1));
f = Expression.Lambda<Func<int, int>>(newBody, f.Parameters);
var compiled = f.Compile();
Console.WriteLine(compiled(5)); // Result: 11

Notice that all expressions are immutable, so they have to be built anew by composition. In this case, I've wrapped the original body in an addition of 1.
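For larger rewrites, rebuilding by hand gets tedious. A sketch of the same "build anew" idea using `ExpressionVisitor`, assuming .NET 4 or later (where that class is public); the names `MultiplyToAddVisitor` and `VisitorDemo` are made up for illustration. The visitor returns a new tree in which every multiplication is replaced by an addition, and the original tree is left untouched:

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical visitor: rewrites every multiplication node into an addition.
public class MultiplyToAddVisitor : ExpressionVisitor
{
    protected override Expression VisitBinary(BinaryExpression node)
    {
        // Visit the children first so nested multiplications are rewritten too.
        var left = Visit(node.Left);
        var right = Visit(node.Right);

        if (node.NodeType == ExpressionType.Multiply)
            return Expression.Add(left, right);

        // Other binary nodes are rebuilt only if a child changed.
        return node.Update(left, node.Conversion, right);
    }
}

public static class VisitorDemo
{
    public static Func<int, int> Rewrite(Expression<Func<int, int>> f)
    {
        var rewritten = (Expression<Func<int, int>>)new MultiplyToAddVisitor().Visit(f);
        return rewritten.Compile();
    }

    public static void Main()
    {
        Expression<Func<int, int>> f = x => x * 2;
        var g = Rewrite(f);

        Console.WriteLine(g(5));           // x + 2 with x = 5: prints 7
        Console.WriteLine(f.Compile()(5)); // original tree unchanged: prints 10
    }
}
```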

Note also that these expression trees cover only expressions, i.e. code that can appear inside a C# method body. You can't get syntax trees for higher-level constructs such as classes this way. Use the CodeDom framework for these.
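To make the CodeDom suggestion concrete, here is a minimal sketch that builds a class-level code graph and renders it as C# source. It assumes the .NET Framework, or the System.CodeDom NuGet package on newer runtimes; the names `CodeDomDemo`, `GenerateClass` and `Answer` are made up for illustration. Note the direction: this constructs a tree and generates code from it, it does not parse existing source.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

public static class CodeDomDemo
{
    // Hypothetical helper: builds a namespace with one class and one method,
    // then returns the generated C# source text.
    public static string GenerateClass(string namespaceName, string className)
    {
        var unit = new CodeCompileUnit();
        var ns = new CodeNamespace(namespaceName);
        unit.Namespaces.Add(ns);

        var type = new CodeTypeDeclaration(className) { IsClass = true };
        ns.Types.Add(type);

        // Equivalent of: public int Answer() { return 42; }
        var method = new CodeMemberMethod
        {
            Name = "Answer",
            Attributes = MemberAttributes.Public | MemberAttributes.Final,
            ReturnType = new CodeTypeReference(typeof(int))
        };
        method.Statements.Add(
            new CodeMethodReturnStatement(new CodePrimitiveExpression(42)));
        type.Members.Add(method);

        using (var provider = new CSharpCodeProvider())
        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromCompileUnit(unit, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(GenerateClass("Demo", "Generated"));
    }
}
```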

Konrad Rudolph
  • Erik accepted this? It uses the very lambda forms he said he didn't want. – Ira Baxter Oct 20 '09 at 02:37
  • Ira: you should pay attention to the development of the discussion. This entry was posted *before* Erik’s edit/clarification. Apparently, none of the other answers were better *at the time* (notice: *one year ago!*) so he didn’t accept another answer. Your answer is probably what he would have wanted. – Konrad Rudolph Oct 20 '09 at 08:16
6

Check out the .NET CodeDom support. There is an old article on CodeProject describing a C# CodeDOM parser, but it won't support the newer language features.

There is also supposed to be support in #develop for generating a CodeDom tree from C# source code according to this posting.

Glorfindel
Rob Walker
4

There is something much more powerful than the R# project: Nemerle.Peg.

https://code.google.com/p/nemerle/source/browse/nemerle/trunk/snippets/peg-parser/

It includes a C# parser that parses all C# code and translates it into an AST:

https://code.google.com/p/nemerle/source/browse/nemerle/trunk/snippets/csharp-parser/

You can download the installer here: https://code.google.com/p/nemerle/

NN_
3

Personally, I would use NRefactory, which is free, open source, and gaining popularity.

konrad.kruczynski
2

ANTLR is not very useful. LINQ is not what you want.

Try Mono.Cecil! http://www.mono-project.com/Cecil

It is used in many projects, including NDepend! http://www.ndepend.com/

halfer
yeeen
2

It looks like this sort of functionality will be included with whatever comes after C# 4, according to Anders Hejlsberg's 'Future of C#' PDC video.

Erik Forbes
  • This helps explain why C# doesn't offer a library for manipulating C# code: its compiler is a classical one, a black box! – yeeen Oct 04 '09 at 06:51
2

The ANTLR Parser Generator has a grammar for C# 3.0 which covers everything except for LINQ syntax.

Erik Forbes
  • I've used ANTLR in the past, and it's quite nice. I haven't used the C# grammar, but most of the contributors there are pretty cluey. – tsimon Nov 25 '08 at 21:59
1

Our C# front end for DMS parses full C# 3.0, including LINQ, and produces ASTs. DMS is in fact an ecosystem for analyzing and transforming source code using ASTs, for any language that has a front end.

EDIT 3/10/2010: ... Now handles full C# 4.0

EDIT: 6/27/2014: Has handled C# 5.0 for quite a while.

EDIT: 6/15/2016: Handles C# 6.0. See https://stackoverflow.com/a/37847714/120163 for a sample AST.

Ira Baxter
  • @Cheeso: Hmm, 2004 would mean we scooped MS. Well, it would never do to suggest that, so I modified it to say 2010. Fixed. – Ira Baxter Dec 29 '10 at 00:39
1

I've just posted an answer on another StackOverflow thread describing an API I implemented to create and manipulate an AST from C# source code.

Dinis Cruz
0

Please see the R# project (sorry, the docs are in Russian, but there are some code examples). It allows AST manipulation of C# code.

http://www.rsdn.ru/projects/rsharp/article/rsharp_mag.xml

Project's SVN is here: (URL updated, thanks, derigel)

Please also see the Nemerle language. It is a .NET language with strong support for metaprogramming.

Alexander Gladysh
0

It is strange that nobody suggested hacking the existing Mono C# compiler.

SK-logic