I have the following code that works, but I would like to modify it so the LINQ query checks whether any of the Regex search strings match the target.

foreach (Paragraph comment in
            wordDoc.MainDocumentPart.Document.Body.Descendants<Paragraph>().Where<Paragraph>(comment => comment.InnerText.Contains("cmt")))
{
    //print values
}

More precisely, I need to select, through LINQ, the strings that start with a letter or with the symbols - or •.

Is this Regex correct for my case?

string pattern = @"^[a-zA-Z-]+$";
Regex rg = new Regex(pattern);

Any suggestions, please?

Thanks in advance for any help.

  • One thing is unclear: you have `comment => comment.InnerText.Contains("cmt")` which fetches items that contain `cmt` anywhere in the string, but in [your next question](https://stackoverflow.com/q/63976797/3832970) you say the string should start with `cmt`. Is it due to the fact these are for different scenarios? – Wiktor Stribiżew Sep 20 '20 at 09:34

2 Answers

You can. It would be better to use query syntax though, as described here: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/how-to-combine-linq-queries-with-regular-expressions

Example:

var queryMatchingFiles =  
            from file in fileList  
            where file.Extension == ".htm"  
            let fileText = System.IO.File.ReadAllText(file.FullName)  
            let matches = searchTerm.Matches(fileText)  
            where matches.Count > 0  
            select new  
            {  
                name = file.FullName,  
                matchedValues = from System.Text.RegularExpressions.Match match in matches  
                                select match.Value  
            };  

Your pattern is fine; just remove the $ from the end and allow any characters after the leading match:

 @"^[a-zA-Z-]+.*"
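
As a rough illustration of how the query syntax could apply to the question's case, here is a minimal sketch. It assumes the paragraph texts are already available as plain strings; the class name `QuerySyntaxSketch` and the `Filter` helper are hypothetical and stand in for iterating over `Paragraph.InnerText`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuerySyntaxSketch
{
    // Pattern from this answer: one or more ASCII letters or hyphens at the
    // start, followed by anything.
    private static readonly Regex StartPattern = new Regex(@"^[a-zA-Z-]+.*");

    // Hypothetical helper: 'texts' stands in for the InnerText of each paragraph.
    public static List<string> Filter(IEnumerable<string> texts)
    {
        var query =
            from text in texts
            where StartPattern.IsMatch(text)
            select text;

        return query.ToList();
    }
}
```

With the Open XML code from the question, the same `where` clause would presumably test `comment.InnerText` instead of `text`, and the existing `foreach` could iterate over the query result.
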
Athanasios Kataras
  • Many thanks for the help. I'm sorry, but I don't understand: do I need to replace my `foreach` with your example, or something else? Please explain. –  Sep 19 '20 at 18:07

Your regex should be modified as

^[\p{L}•-]

To also allow whitespace at the start of the string add \s and use

^[\p{L}\s•-]

Details

  • ^ - start of string
  • [\p{L}•-] - a letter, • or -
  • [\p{L}\s•-] - a letter, whitespace, • or -

In C#, use

var reg = new Regex(@"^[\p{L}•-]");
foreach (Paragraph comment in
    wordDoc.MainDocumentPart.Document.Body.Descendants<Paragraph>()
       .Where<Paragraph>(comment => reg.IsMatch(comment.InnerText)))
{
    //print values
}
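
To see the difference between the two character classes without a Word document at hand, here is a small sketch that runs both patterns over plain strings. The class `StartCheckSketch`, the `Keep` helper and the sample strings are assumptions made for illustration; only the two patterns come from the answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class StartCheckSketch
{
    // Letter, bullet or hyphen as the first character.
    private static readonly Regex Strict = new Regex(@"^[\p{L}•-]");

    // Same, but a leading whitespace character is accepted as well.
    private static readonly Regex WithWhitespace = new Regex(@"^[\p{L}\s•-]");

    // Hypothetical helper: keeps the strings whose first character is allowed.
    public static string[] Keep(string[] texts, bool allowLeadingWhitespace)
    {
        Regex reg = allowLeadingWhitespace ? WithWhitespace : Strict;
        return texts.Where(t => reg.IsMatch(t)).ToArray();
    }

    // Made-up sample data standing in for paragraph texts.
    public static readonly string[] Samples =
    {
        "cmt note", "È vero", "• bullet", "- dash", " indented", "1. numbered"
    };
}
```

On these samples the strict pattern keeps the first four strings, and the whitespace variant additionally keeps " indented". Since \p{L} covers any Unicode letter, a string starting with an accented letter such as "È vero" passes too.
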

If you want to match those items containing cmt and also matching this regex, you may adjust the pattern to

var reg = new Regex(@"^(?=.*cmt)[\p{L}\s•-]", RegexOptions.Singleline);
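
The following sketch shows what the lookahead and RegexOptions.Singleline contribute, using made-up strings. The class `LookaheadSketch` and its method names are hypothetical; the pattern is the one shown above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookaheadSketch
{
    private const string Pattern = @"^(?=.*cmt)[\p{L}\s•-]";

    // With Singleline, "." also matches a newline, so the lookahead can find
    // "cmt" on a later line of the same string.
    private static readonly Regex AcrossLines = new Regex(Pattern, RegexOptions.Singleline);

    // Without Singleline, the lookahead only scans up to the first newline.
    private static readonly Regex FirstLineOnly = new Regex(Pattern);

    public static bool MatchesAcrossLines(string text) => AcrossLines.IsMatch(text);

    public static bool MatchesFirstLineOnly(string text) => FirstLineOnly.IsMatch(text);
}
```

For example, "• item\ncmt here" should match only with Singleline, "note about cmt" matches either way, and "1 cmt" is rejected by both because it starts with a digit.
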

If you need to only allow cmt at the start of the string:

var reg = new Regex(@"^(?:cmt|[\p{L}\s•-])");
Wiktor Stribiżew
  • Many thanks for the reply. Your `regexp` works perfectly, but now I need to select through `LINQ` the strings that start with letters, with the symbols `-` or `•`, or with `whitespace`. I tried `^[\p{L}•- ]` without success. –  Sep 20 '20 at 07:14
  • Also, your `regexp` does not validate accented characters (diacritics) such as `E’ ` –  Sep 20 '20 at 07:27
  • @Kooper Diacritic is already the second char, so if the first one is a letter, it is already fine. To check if a string starts with whitespace, use `^[\p{L}\s•-]`. Always [put hyphen at the closing bracket](https://stackoverflow.com/questions/3697202). – Wiktor Stribiżew Sep 20 '20 at 09:22