
I need to use C# to search a directory (C:\Logs) for log files whose names start with ACCESS. Once I find such a file, I need to search it and build a collection of strings of the form Identity="...". An example would be Identity="SWN\smithj", so I need everything from Identity through the closing double quote. After reaching the end of the file, I need to move on to the next file that begins with ACCESS. Can someone show me how to do this in C#?

Many thanks

Josh

2 Answers


It looks like you've got two functions here:
1) Find the Files with names like ACCESS*
2) Search those files for lines like "Identity=*"

To do the first, use a DirectoryInfo object and the GetFiles() method with a search pattern of "ACCESS*".

DirectoryInfo myDir = new DirectoryInfo(dirPath);
var files = myDir.GetFiles("ACCESS*"); // GetFiles is an instance method, so call it on myDir

Then you'll loop through those files looking for the data you need.

List<Tuple<string, string>> IdentityLines = new List<Tuple<string, string>>(); // Item1 = file name, Item2 = line
foreach (FileInfo file in files)
{
    using (StreamReader sr = new StreamReader(file.FullName))
    {
        string line;
        // ReadLine returns null at end of file
        while ((line = sr.ReadLine()) != null)
        {
            if (line.StartsWith("Identity="))
                IdentityLines.Add(Tuple.Create(file.Name, line));
        }
    }
}

This hasn't been compiled, so double check it, but it should be pretty close to what you need.

EDIT: Added full solution based on comments from OP. Has been compiled and run.

DirectoryInfo myDir = new DirectoryInfo(@"C:\Testing");
var Files = myDir.GetFiles("ACCESS*");

List<KeyValuePair<string, string>> IdentityLines = new List<KeyValuePair<string, string>>();

foreach(FileInfo file in Files)
{
    string line = "";
    using(StreamReader sr = new StreamReader(file.FullName))
    {
        while((line = sr.ReadLine()) != null) // stop at end of file, not at the first blank line
        {
           if(line.ToUpper().StartsWith("IDENTITY="))
              IdentityLines.Add(new KeyValuePair<string, string>(file.Name, line));
        }
    }
}

foreach(KeyValuePair<string, string> line in IdentityLines) 
{
    Console.WriteLine("FileName {0}, Line {1}", line.Key, line.Value);
}
AllenG
  • Hi Allen- The line just before your loop, is that correct? What is tuple and what is IdentityLines? – Josh Aug 10 '10 at 15:06
  • A Tuple is a paired type. Basically it's just designed to create a correlation between two pieces of data. It's new in .NET 4, so if you're using 3.5 or earlier, you can replace it with a List<KeyValuePair<string, string>>. IdentityLines is just the variable name I gave to the list. Once you're done gathering your lines, you can then output them along with the file in which they were found. – AllenG Aug 10 '10 at 15:09
  • @Allen - I used List<KeyValuePair<string, string>> IdentityLines; In the while loop, on the last line, I have IdentityLines.Add(file.Name, line); and I'm getting a "No overload for method 'Add' takes two arguments" error. Thoughts? – Josh Aug 10 '10 at 15:22
  • It's because I missed a step. Try `IdentityLines.Add(new KeyValuePair<string, string>(file.Name, line))`. You may have to play with it (thank goodness for intellisense) to get it exactly right. – AllenG Aug 10 '10 at 15:26
  • The "No overload for method 'Add' takes two arguments" error still occurs on the last line: List<KeyValuePair<string, string>> IdentityLines = new List<KeyValuePair<string, string>>(); foreach(FileInfo file in files) { using(StreamReader sr = new StreamReader(file.FullName)) { string line; while(!string.IsNullOrEmpty(line = sr.Read().ToString())) { if (line.StartsWith("Identity=")) IdentityLines.Add(file.Name, line); } } } – Josh Aug 10 '10 at 15:27
  • oh ok- i didn't see your last post until now. Let me try that – Josh Aug 10 '10 at 15:28
  • With a StreamReader you can also read the entire file into memory with the StreamReader.ReadToEnd() method (works well for smallish files). – Edward Leno Aug 10 '10 at 15:28
  • that fixed the error. So now all string that start with Identity=" " will be stored in IdentityLines? I assume I can just print those out to a text file now? – Josh Aug 10 '10 at 15:30
  • @Edward: Indeed you can, but this way he can grab just what he needs and ignore the rest.
    @Josh: That's the theory. Like I said, I haven't actually run that code, so there may be additional errors, but you should be able to print that list to screen, or drop to a file, or whatever you need to do with it.
    – AllenG Aug 10 '10 at 15:44
  • @Allen: Hi Allen, do you see anything wrong with the code below? For some reason the message box never pops up, which means nothing is getting added to IdentityLines: string line; while(!string.IsNullOrEmpty(line = sr.Read().ToString())) { if (line.StartsWith("Identity=")) { IdentityLines.Add(new KeyValuePair<string, string>(file.Name, line)); MessageBox.Show("Your inside"); } – Josh Aug 10 '10 at 15:53
  • @Josh: It may depend on your files. I'm updating my code with something I've now actually compiled and run. – AllenG Aug 10 '10 at 16:10
  • Hi Allen- Your code works, but I think there is a problem with using .StartsWith for the string search. Each line in the log file is a rather long string, and somewhere within it is something like: yadayada Identity="swn\smithj" yadayada. What I need to collect is just the Identity="swn\smithj" part. It wasn't picking up anything with .StartsWith, but when I changed it to .Contains it picked up and returned the entire line, not just Identity="swn\rodgert". – Josh Aug 10 '10 at 18:05
  • You may need to work with your search string, then. It's also possible that Regex will help with that (I can do a little regex, but there are others on SO who are far better). That said, looking at @Dan Tao's suggestion looks a lot cleaner than what I'm suggesting, assuming you're on VS2K8. – AllenG Aug 10 '10 at 18:27
  • Allen - Your solution ended up working. Do you know how I can get rid of duplicate values before I start printing the result out? – Josh Aug 10 '10 at 19:24
  • @Josh: define a duplicate value. Do you mean a second entry of the same "IDENTITY=XXXXX" with a given key, or just any duplicate entries for "IDENTITY=XXXXX" regardless of key? – AllenG Aug 10 '10 at 19:34
  • Yeah, sorry about that, Allen. Within my large string, Identity can be equal to "swn\smithj" or maybe "swn\rodgersb", etc. So when I'm printing out each of these results I sometimes have duplicates like Identity="swn\smithj" Identity="smithj". Does that help? I'm a little confused about what you meant. – Josh Aug 10 '10 at 19:48
  • @Josh-At this point, I'd say make this a new question: Reference this question, but then give a sample of the output you're getting that's in error and what you'd prefer to be getting. – AllenG Aug 10 '10 at 19:57
  • Yeah I wondered about asking that additional question before asking it. I'll try to make a new question and reference this question. Thanks again – Josh Aug 10 '10 at 20:25
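
For the two follow-up points raised in the comments above (pulling just the Identity="..." token out of a longer line, and dropping duplicates), a minimal sketch using Regex and a HashSet might look like this. The names IdentityExtractor and ExtractIdentities are hypothetical, and the pattern assumes the quoted value never contains an embedded double quote. "Duplicate" here means the same token ignoring case, which may or may not match what the OP had in mind.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of either answer.
public static class IdentityExtractor
{
    // Matches Identity="..." anywhere in a line.
    // Assumption: the value between the quotes contains no double quote.
    private static readonly Regex IdentityPattern =
        new Regex("Identity=\"[^\"]*\"", RegexOptions.IgnoreCase);

    // Yields each distinct Identity="..." token, in first-seen order.
    public static IEnumerable<string> ExtractIdentities(IEnumerable<string> lines)
    {
        // Case-insensitive set, so swn\smithj and SWN\smithj count as one entry.
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            foreach (Match m in IdentityPattern.Matches(line))
            {
                // HashSet.Add returns false when the value is already present.
                if (seen.Add(m.Value))
                    yield return m.Value;
            }
        }
    }
}
```

Passing it the lines of one file (for example `File.ReadAllLines(file.FullName)`) removes duplicates within that file; to remove them across all ACCESS* files, pass the concatenated lines of every file in a single call.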

Here's a pretty terse way to accomplish what you're after.

public static IEnumerable<string> GetSpecificLines(this DirectoryInfo dir, string fileSearchPattern, Func<string, bool> linePredicate)
{
    FileInfo[] files = dir.GetFiles(fileSearchPattern);

    return files
        .SelectMany(f => File.ReadAllLines(f.FullName))
        .Where(linePredicate);
}

Usage:

var lines = new DirectoryInfo(@"C:\Logs") // verbatim string: "\L" is not a valid escape sequence
    .GetSpecificLines("ACCESS*", line => line.StartsWith("Identity="));
Dan Tao
  • Very nice use of linq. I wonder if chaining the Where() filter off ReadAllLines is more efficient? e.g. .SelectMany(f => File.ReadAllLines(f.FullName).Where(linePredicate)); This way you wouldn't potentially store a whole lot of data in a buffer prior to filtering. – Jacob Aug 10 '10 at 16:38
  • @Jacob: I see what you mean; but unless I'm mistaken, it really shouldn't make any difference. Since `SelectMany` and `Where` are lazily evaluated, the steps will be the same: the lines within each `string[]` array returned by `File.ReadAllLines` will be enumerated over individually, and only those matching `linePredicate` will be returned. If you step through the code in a debugger you'll see what I mean. – Dan Tao Aug 10 '10 at 16:49
  • @Jacob: (In other words what I'm saying is that you won't be storing "a whole lot of data in a buffer" either way -- except for the `string[]` arrays returned by `File.ReadAllLines`, again, either way -- because the Linq extension methods provide lazy evaluation.) – Dan Tao Aug 10 '10 at 16:51
  • Hi Dan- Where you say Usage: var lines = new DirectoryInfo("C:\Logs"), when I type a period right after that last parenthesis, am I supposed to see GetSpecificLines as an option? Because I don't. I see many other options but not GetSpecificLines. Am I missing something? – Josh Aug 10 '10 at 17:36
  • @Josh: The example I provided happens to be an extension method. To get that to work you would define it from inside a static class. Then once you've compiled the project the method should appear in Intellisense. If you are using an older version of .NET, you will just have to make it a regular static method. – Dan Tao Aug 10 '10 at 23:11
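
To make the extension-method point above concrete, here is a sketch of the method placed inside a static class, which is what lets it show up on DirectoryInfo. The class name LogSearchExtensions is hypothetical. This version also swaps File.ReadAllLines for File.ReadLines (available from .NET 4), which streams lines lazily instead of loading each file into an array first; that is a substitution, not what the answer originally used.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Extension methods must be declared in a non-nested static class.
public static class LogSearchExtensions
{
    // Returns every line matching linePredicate from files in dir
    // whose names match fileSearchPattern (e.g. "ACCESS*").
    public static IEnumerable<string> GetSpecificLines(
        this DirectoryInfo dir,
        string fileSearchPattern,
        Func<string, bool> linePredicate)
    {
        return dir.GetFiles(fileSearchPattern)
            // File.ReadLines enumerates lazily, one line at a time.
            .SelectMany(f => File.ReadLines(f.FullName))
            .Where(linePredicate);
    }
}
```

With that class compiled into the project (and its namespace imported, if it has one), the call becomes `new DirectoryInfo(@"C:\Logs").GetSpecificLines("ACCESS*", line => line.Contains("Identity="))`. Because the result is lazy, files are only opened when the sequence is enumerated.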