I am having trouble extracting the full text of a class using RegEx in C#. I need the whole structure, from "public class ..." to the final closing brace, because it will be compiled as dynamic code (using a CSharpCodeProvider object).

Here is a sample of one of my classes:

    [Worker("Structure")]
    public class DataSourceStructure
    {
        DataSet mainData = new DataSet();
        DataTable worker = new DataTable("Worker");
        DataTable year = new DataTable("Year");

        public DataSet MainSource 
        {       
           get 
           {
            worker.Columns.Add("Name");
            worker.Columns.Add("MonthSallary");
            worker.Columns.Add("DateOfBirth");
            worker.Columns.Add("WorkDescription");
            worker.Columns.Add("Sex");
            worker.Columns.Add("Worker_Id", typeof(int));

            year.Columns.Add("YearOfEmployment");
            year.Columns.Add("Worker_Id", typeof(int));

            mainData.Tables.Add(worker);
            mainData.Tables.Add(year);

            DataRelation rel = new DataRelation("Worker_Year", mainData.Tables["Worker"].Columns["Worker_Id"], mainData.Tables["Year"].Columns["Worker_Id"], true);

            mainData.Relations.Add(rel);

            return mainData;
           }
           set 
           { 
            mainData = value; 
           } 
       }
   }

I tried some approaches described on StackOverflow (for example: Using RegEx to balance match parenthesis), but they don't work for me, or I don't know how to adapt them correctly.

Thanks in advance for any help.

Maciej S.
  • You do have one class per file? – Thomas Ayoub Apr 11 '16 at 13:41
  • Unfortunately not - there are many class and attribute definitions. – Maciej S. Apr 11 '16 at 13:42
  • `(public class .*})\s+\b\w+\b class` using singleLine option ? – Thomas Ayoub Apr 11 '16 at 13:45
  • I think that this could easily fall over into the category of items where you have to abandon regex and use a *parser*. Especially once you realise that just balancing braces isn't sufficient when strings and comments are allowed to contain unmatched braces. – Damien_The_Unbeliever Apr 11 '16 at 13:51
  • Thomas - your pattern works (no errors), but I can't use it: because the file contains many class and attribute definitions, I get a single match consisting of the whole text of the file. – Maciej S. Apr 11 '16 at 14:02

1 Answer

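A regular expression will not always work for your use case. The C# grammar is quite complex (context-free, with some context-specific rules), while classic regular expressions can only describe regular languages. .NET's balancing groups can count nested braces, but they still cannot tell when a brace sits inside a string literal or a comment.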
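To make the limitation concrete, here is a minimal sketch of the balancing-group approach, using only `System.Text.RegularExpressions`. The names `ClassTextFinder` and `FindClasses` are hypothetical, and the pattern is an illustration written under simple assumptions (attributes without nested `]`, no `class` keyword inside comments or strings), not a recommended solution.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class ClassTextFinder
{
    private static readonly Regex ClassPattern = new Regex(
        // Zero or more attribute lists such as [Worker("Structure")].
        @"(?:\[[^\]]*\]\s*)*" +
        // Zero or more modifiers in front of the class keyword.
        @"(?:\b(?:public|internal|private|protected|static|sealed|abstract|partial)\s+)*" +
        // The class keyword, the class name, and anything up to the opening brace.
        @"\bclass\s+\w+[^{]*" +
        // A brace-balanced body: push on '{', pop on '}', fail if anything is left open.
        @"\{(?>[^{}]+|(?<depth>\{)|(?<-depth>\}))*(?(depth)(?!))\}");

    // Returns the text of every match, in source order.
    public static List<string> FindClasses(string source)
    {
        var result = new List<string>();
        foreach (Match match in ClassPattern.Matches(source))
        {
            result.Add(match.Value);
        }
        return result;
    }
}
```

On well-behaved input such as `class A { void M() { } } class B { }` this yields two separate matches. It breaks as soon as a brace appears where the pattern cannot see context: for `class A { string s = "}"; }` the match stops at the brace inside the string literal, so the extracted text is truncated. The same happens with braces in comments, which is why a real parser is the safer choice.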

Consider using Roslyn instead. Here is a tutorial and a simple snippet of code to get you started.

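    // Needs the Microsoft.CodeAnalysis.CSharp NuGet package; usings: System.IO, System.Linq,
    // Microsoft.CodeAnalysis.CSharp, Microsoft.CodeAnalysis.CSharp.Syntax.
    var code = File.ReadAllText("path/to/cs/file/here");
    var root = CSharpSyntaxTree.ParseText(code).GetRoot();
    var classes = root.DescendantNodes().OfType<ClassDeclarationSyntax>();
    // Calling ToString() on a class node returns its source text, attributes included.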

Hope this helps!

Ani
  • What happens if there is another class or something else after the class closing braces? – Pradeep Kumar Apr 11 '16 at 14:21
  • ananthonline - thank you very much for the answer, but we can't use any external DLLs for this task. – Maciej S. Apr 11 '16 at 14:27
  • The Roslyn parser is used in the C# compiler. It is robust enough to handle all of those conditions and more. – Ani Apr 11 '16 at 14:28
  • Yes, you're right. I will talk to my manager and try to convince him to use Roslyn. Thank you, I accept your solution. – Maciej S. Apr 11 '16 at 14:46