I'm developing a file reader for a complex format with literally hundreds of different entry types. The way I'm doing it now, I need two StreamReaders, because I have to extract some information in a first pass before the main read. The files are too big to read into memory at once.

What I want is to tell the user which lines have not been read. My structure is like this:

string line;
StreamReader file1 = new StreamReader(path);
while ((line = file1.ReadLine()) != null)
{
    if (line.StartsWith("HELLO"))
    {
        //...
    }
    //... more conditions
}



StreamReader file2 = new StreamReader(path);
while ((line = file2.ReadLine()) != null)
{
    if (line.StartsWith("GOOD MORNING"))
    {
        //...
    }
    //... more conditions
}

If my reader were perfect, every line would have been read by the end. But the format can be bizarre and some entries may not be implemented yet, so I want to catch those lines. The problem, as you can see, is having two StreamReaders.

My options are:

  1. Store all unread lines from the first pass in an array, then remove them one by one as the second pass reads them. Not good, because I would be storing several thousand lines.
  2. Add all the conditions from the second StreamReader loop to the first one as well, so that the first pass knows which lines will be read the second time. Better than the previous option, but I have to modify my code in several places to keep it working: whenever I implement reading a new entry in the second StreamReader loop, I also have to modify the first.

Is there any suggestion or any better way of doing this?

Sturm
  • What's wrong with keeping track of the line number on each read-through, and adding the line number to a collection if it doesn't meet any of your conditions? – Brian Snow Mar 10 '14 at 07:01
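
A minimal sketch of the line-number idea from the comment above, assuming each pass can be expressed as a function that returns true when it handles a line. The names `UnreadLineTracker`, `FindUnread`, `firstPass` and `secondPass` are hypothetical and not part of the original code. Only integers are kept in memory, not the lines themselves.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class UnreadLineTracker
{
    // Returns the 1-based numbers of the lines that neither pass handled.
    // 'open' is called once per pass so each pass gets a fresh reader.
    public static SortedSet<int> FindUnread(
        Func<TextReader> open,
        Func<string, bool> firstPass,
        Func<string, bool> secondPass)
    {
        var unread = new SortedSet<int>();

        // Pass 1: remember the number of every line the first pass skips.
        using (TextReader reader = open())
        {
            string line;
            int lineNumber = 0;
            while ((line = reader.ReadLine()) != null)
            {
                lineNumber++;
                if (!firstPass(line))
                    unread.Add(lineNumber);
            }
        }

        // Pass 2: forget the numbers of lines the second pass handles.
        using (TextReader reader = open())
        {
            string line;
            int lineNumber = 0;
            while ((line = reader.ReadLine()) != null)
            {
                lineNumber++;
                if (secondPass(line))
                    unread.Remove(lineNumber);
            }
        }

        return unread;
    }
}
```

With a real file this might be called as `UnreadLineTracker.FindUnread(() => new StreamReader(path), l => l.StartsWith("HELLO"), l => l.StartsWith("GOOD MORNING"))`; whatever remains in the set can be reported to the user.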

2 Answers

I would write predicate functions that translate each line into a typed result, e.g.:

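class PredicateResult{
    public En_LineType type;
    public String data;
}

// Returns a result if the first reader recognizes the line, otherwise null.
private PredicateResult FirstReader(String line){
    if(line.StartsWith("HELLO")){
        return new PredicateResult{
            type = En_LineType.Hello,
            data = ...
        };
    }
    return null; // line is not handled by this reader
}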

That way, you have two functions (one per reader) that can be used to check whether a line matches either of them. Additionally, you can easily change the condition a line is matched on, and you can support different formats.
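
As a rough sketch of how the two functions could be combined to find unimplemented entries: a line is reported when neither function recognizes it. The second function, the enum members, the `LineParsers` class, and the way `data` is filled (the text after the keyword) are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

enum En_LineType { Hello, GoodMorning }

class PredicateResult
{
    public En_LineType type;
    public string data;
}

static class LineParsers
{
    // Assumption: 'data' is whatever follows the keyword on the line.
    public static PredicateResult FirstReader(string line)
    {
        const string key = "HELLO";
        if (line.StartsWith(key, StringComparison.Ordinal))
            return new PredicateResult { type = En_LineType.Hello, data = line.Substring(key.Length).Trim() };
        return null; // not handled by the first reader
    }

    public static PredicateResult SecondReader(string line)
    {
        const string key = "GOOD MORNING";
        if (line.StartsWith(key, StringComparison.Ordinal))
            return new PredicateResult { type = En_LineType.GoodMorning, data = line.Substring(key.Length).Trim() };
        return null; // not handled by the second reader
    }

    // Returns the 1-based numbers of lines that neither reader recognizes.
    public static List<int> FindUnrecognized(TextReader reader)
    {
        var result = new List<int>();
        string line;
        int lineNumber = 0;
        while ((line = reader.ReadLine()) != null)
        {
            lineNumber++;
            if (FirstReader(line) == null && SecondReader(line) == null)
                result.Add(lineNumber);
        }
        return result;
    }
}
```

Because the matching logic lives in one place per reader, adding a new entry type to `SecondReader` automatically updates the unrecognized-line check, which avoids duplicating conditions across the two loops.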

Ari

There are many string searching algorithms. Some of them, such as Rabin–Karp, hash a sliding window of the text; for the general idea, see "What is Sliding Window Algorithm? Examples?".

Each algorithm differs slightly in average complexity, worst-case behavior, and so on. You can pick whichever best suits your application:

Rabin–Karp algorithm

Aho–Corasick string matching algorithm

Knuth–Morris–Pratt algorithm

Boyer–Moore string search algorithm
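
To give an idea of what one of these looks like, here is a small sketch of Knuth–Morris–Pratt in C#. The class and method names (`Kmp`, `IndexOf`) are made up for this example. For simple prefix checks such as `StartsWith("HELLO")`, the built-in string methods are probably sufficient; this is only meant to illustrate the technique.

```csharp
using System;

static class Kmp
{
    // fail[i] is the length of the longest proper prefix of
    // pattern[0..i] that is also a suffix of pattern[0..i].
    static int[] BuildFailureTable(string pattern)
    {
        var fail = new int[pattern.Length];
        int k = 0;
        for (int i = 1; i < pattern.Length; i++)
        {
            while (k > 0 && pattern[i] != pattern[k])
                k = fail[k - 1];
            if (pattern[i] == pattern[k])
                k++;
            fail[i] = k;
        }
        return fail;
    }

    // Returns the index of the first occurrence of pattern in text, or -1.
    public static int IndexOf(string text, string pattern)
    {
        if (pattern.Length == 0)
            return 0;

        int[] fail = BuildFailureTable(pattern);
        int k = 0; // number of pattern characters matched so far
        for (int i = 0; i < text.Length; i++)
        {
            // On a mismatch, fall back in the pattern instead of rewinding the text.
            while (k > 0 && text[i] != pattern[k])
                k = fail[k - 1];
            if (text[i] == pattern[k])
                k++;
            if (k == pattern.Length)
                return i - k + 1;
        }
        return -1;
    }
}
```

The text index `i` never moves backwards, so a search takes time proportional to the combined length of the text and the pattern.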

stratovarius