I currently go trought all my source files and read their text with File.ReadAllLines
and i want to filter all comments with one regex. Basically all comment possiblities. I tried several regex solutions i found on the internet. As this one:
@"(@(?:""[^""]*"")+|""(?:[^""\n\\]+|\\.)*""|'(?:[^'\n\\]+|\\.)*')|//.*|/\*(?s:.*?)\*/"
And the top result when i google:
string blockComments = @"/\*(.*?)\*/";
string lineComments = @"//(.*?)\r?\n";
string strings = @"""((\\[^\n]|[^""\n])*)""";
string verbatimStrings = @"@(""[^""]*"")+";
See: Regex to strip line comments from C#
The second solution won't recognize any comments.
Thats what i currently do
public static List<string> FormatList(List<string> unformattedList, string dataType)
{
List<string> formattedList = unformattedList;
string blockComments = @"/\*(.*?)\*/";
string lineComments = @"//(.*?)\r?\n";
string strings = @"""((\\[^\n]|[^""\n])*)""";
string verbatimStrings = @"@(""[^""]*"")+";
string regexCS = blockComments + "|" + lineComments + "|" + strings + "|" + verbatimStrings;
//regexCS = @"(@(?:""[^""]*"")+|""(?:[^""\n\\]+|\\.)*""|'(?:[^'\n\\]+|\\.)*')|//.*|/\*(?s:.*?)\*/";
string regexSQL = "";
if (dataType.Equals("cs"))
{
for(int i = 0; i < formattedList.Count;i++)
{
string line = formattedList[i];
line = line.Trim(' ');
if(Regex.IsMatch(line, regexCS))
{
line = "";
}
formattedList[i] = line;
}
}
else if(dataType.Equals("sql"))
{
}
else
{
throw new Exception("Unknown DataType");
}
return formattedList;
}
The first Method recognizes the comments, but also finds things like
string[] bla = text.Split('\\\\');
Is there any solution to this problem? That the regex excludes the matches which are in a string/char? If you have any other links i should check out please let me know!
I tried a lot and can't figure out why this won't work for me.
[I also tried these links]
https://blog.ostermiller.org/find-comment
https://codereview.stackexchange.com/questions/167582/regular-expression-to-remove-comments