I'm kinda new to Regex and C#.

I'm trying to build a tool that goes through a list of files and returns the name of each file whose content contains a certain pattern. I'm using the block below to go through the list:

var queryMatchingFiles =
    from file in fileList
    where file.Extension != ".dll" && file.Extension != ".pdb"
    let fileText = File.ReadAllText(file.FullName)
    let matches = Regex.Matches(fileText, pattern, RegexOptions.IgnoreCase)
    where matches.Count > 0
    select new {
        name = file.FullName,
        matchedValues =
            from Match match in matches
            select match.Value
    };

Now the pattern I'm searching for in the files is .htc. I know that a dot in regex matches any character, so I tried the following to force the pattern to match .htc literally:

pattern = @"\b" + pattern + @"\b";

or

pattern = string.Format(@"\b" + pattern + @"\b");

and it still doesn't treat the dot in .htc as a literal dot. Any ideas how to tackle this problem?

EDIT: I'm not looking for the file extension. What I'm trying to do is scan through the content of HTML and text files and see if it contains certain words like .htc

EDIT 2: Thank you guys for your answers, pattern = Regex.Escape(".htc"); is what I was looking for!

  • 5
    Using regular expressions to check file extensions sounds like a good way to get 2 problems instead of 1. Use the BCL APIs for Paths: http://msdn.microsoft.com/en-us/library/system.io.path.getextension%28v=vs.110%29.aspx ...or simply `EndsWith`. :) – bzlm Jan 20 '15 at 20:28
  • Can you please provide me with an example? And I'm not checking the file extensions, I'm scanning different text files and HTML pages to see if the pages contain `.htc` – Hassan A. Al-Rawi Jan 20 '15 at 20:31
  • From the link `extension = Path.GetExtension(fileName)` – Jasen Jan 20 '15 at 20:31
  • 2
    There are examples on the link I provided. If you're new to MSDN, which it sounds like, then congratulations; MSDN is a gold mine for people wanting to learn about .NET in general and C# specifically. To dig out actual file names from the text files you open (they are text files, right?) you still need an intermediary step, unless the files *only* contain file names, in which case the examples on MSDN should be enough, right? – bzlm Jan 20 '15 at 20:32
  • If you're simply scanning a file for specific text, you can simply use [`ReadAllText()`](http://msdn.microsoft.com/en-us/library/ms143368(v=vs.110).aspx) and then [`IndexOf()`](http://msdn.microsoft.com/en-us/library/system.string.indexof%28v=vs.110%29.aspx). – Erik Philips Jan 20 '15 at 20:37
  • @bzlm: Indeed, a gold mine with many hidden passages, rich treasures that are sometimes just buried far enough so you don't spot them at the first glance, galleries that look alike, and even more deceptively sparkling rocks. Sorry, could not resist ;) – O. R. Mapper Jan 20 '15 at 20:38

1 Answer

1

Special regex characters such as the dot have to be escaped if they are meant literally.

pattern = @"\.htc";

You can do it automatically with

pattern = Regex.Escape(".htc");
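To illustrate the difference, here is a minimal sketch comparing the unescaped and escaped pattern on a made-up sample string. The class name `EscapeDemo`, the helper `CountMatches`, and the sample text are assumptions for illustration only, not part of the question's code. It also shows why wrapping the escaped pattern in `\b` on both sides can miss matches: a dot is not a word character, so a leading `\b` only succeeds when a word character directly precedes the dot.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EscapeDemo
{
    // Hypothetical helper: counts case-insensitive matches of a pattern.
    public static int CountMatches(string text, string pattern)
    {
        return Regex.Matches(text, pattern, RegexOptions.IgnoreCase).Count;
    }

    public static void Main()
    {
        // Made-up sample content standing in for a scanned file.
        string text = "behavior: url(menu.htc); see also xhtc and .HTC";

        string literal = ".htc";
        string escaped = Regex.Escape(literal); // yields \.htc

        // Unescaped: '.' matches any character, so "xhtc" is counted too.
        Console.WriteLine(CountMatches(text, literal));                    // 3

        // Escaped: only a literal dot followed by "htc" matches.
        Console.WriteLine(CountMatches(text, escaped));                    // 2

        // Leading \b needs a word character before the dot, so the
        // standalone ".HTC" (preceded by a space) is not matched.
        Console.WriteLine(CountMatches(text, @"\b" + escaped + @"\b"));    // 1

        // A trailing \b alone still rejects longer words such as ".htcx".
        Console.WriteLine(CountMatches(text, escaped + @"\b"));            // 2
    }
}
```

If the goal is only to avoid matching longer words such as `.htcx`, a trailing `\b` after the escaped pattern is probably enough.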

Or use this solution if you only need a case insensitive "contains" function: https://stackoverflow.com/a/15464440/880990
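For a plain case-insensitive "contains" check, one common approach is `String.IndexOf` with `StringComparison.OrdinalIgnoreCase`, which avoids regex escaping entirely. The sketch below is an assumption about what such a helper could look like and is not necessarily identical to the linked answer; the names `ContainsDemo` and `ContainsIgnoreCase` are hypothetical.

```csharp
using System;

public static class ContainsDemo
{
    // Hypothetical helper: true if 'value' occurs in 'text', ignoring case.
    public static bool ContainsIgnoreCase(string text, string value)
    {
        return text.IndexOf(value, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    public static void Main()
    {
        // The dot is treated as an ordinary character here, no escaping needed.
        Console.WriteLine(ContainsIgnoreCase("behavior: url(menu.HTC)", ".htc")); // True
        Console.WriteLine(ContainsIgnoreCase("nothing relevant here", ".htc"));   // False
    }
}
```

In the question's query, such a check could replace the `Regex.Matches` call when the matched values themselves are not needed, only the names of the files that contain the text.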

Olivier Jacot-Descombes