
I've got a system that processes some large CSV files.

The scenario has now arisen where these files might have a number of non-delimited, worthless lines preceding the actual comma-delimited content.

The approach I have taken is to create a temporary reader to ascertain the number of superfluous lines and then move the working TextReader on that number of lines ready to be processed.

My code is as follows:

    private static TextReader PrepareReader(TextReader reader)
    {
        // Variables
        TextReader tmpReader = reader;
        Int32 superfluousLineCount = 0;

        // Determine how many useless lines we have
        using (tmpReader)
        {
            string line;
            string headerIdentifier = "&1,";
            while ((line = tmpReader.ReadLine()) != null)
            {
                // Check if the line starts with the header row identifier
                if (line.Substring(0, 3) != headerIdentifier)
                {
                    // Increment the superfluous line counter
                    superfluousLineCount++;
                }
                else
                {
                    break;
                }
            }
        }

        // Move the source reader through how many lines we want to ignore
        using (reader)
        {
            for (int i = superfluousLineCount; i > 0; i--)
            {
                reader.ReadLine();
            }
        }

        // Return
        return reader;
    }

However, the reader.ReadLine(); in this part of the code:

    for (int i = superfluousLineCount; i > 0; i--)
    {
        reader.ReadLine();
    }

...throws the following exception

Cannot read from a closed TextReader. ObjectDisposedException in mscorlib Method: Void ReaderClosed()

Stack Trace: at System.IO.__Error.ReaderClosed() at System.IO.StreamReader.ReadLine() at CsvReader.PrepareReader(TextReader reader) in CsvReader.cs:line 93

Any advice greatly appreciated. Also, is this the best way to go about my challenge?

Notes: Framework 2.0

Thanks.

Ste

2 Answers

7

When the `using (tmpReader)` block exits, it disposes tmpReader, which references the same object as reader. So by the time your second loop tries to read from reader, it is already closed.

Your best bet is to combine the two loops. Since you only want to skip lines, I would think the logic of the first loop is sufficient.
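The aliasing problem can be reproduced in isolation. The following is a minimal sketch, not the asker's code: it uses a `StringReader` with made-up content in place of the real file, and the class and method names (`AliasDemo`, `ReaderClosedByAlias`) are hypothetical.

```csharp
using System;
using System.IO;

public static class AliasDemo
{
    // Returns true if reading from `reader` fails after a using block
    // has run on a second variable that points at the same object.
    public static bool ReaderClosedByAlias()
    {
        TextReader reader = new StringReader("junk\n&1,a,b\n");
        TextReader tmpReader = reader; // copies the reference, not the reader

        using (tmpReader)
        {
            tmpReader.ReadLine();
        } // Dispose() runs here, on the one shared object

        try
        {
            reader.ReadLine();
            return false;
        }
        catch (ObjectDisposedException)
        {
            // Same failure as in the question: the reader is closed
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ReaderClosedByAlias()); // True
    }
}
```

Because there is only ever one reader object, there is no second cursor to "catch up"; whatever the first loop consumes is already consumed for everyone holding that reference.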

Attila
  • Right. This line `TextReader tmpReader = reader;` is completely meaningless. – Kirk Woll Jun 14 '12 at 15:08
  • Thanks, that makes perfect sense now it's been pointed out. Is there a way to *copy* a TextReader without interrogating the file again? – Ste Jun 14 '12 at 15:13
  • Check out [this SO thread](http://stackoverflow.com/questions/831417/how-do-you-reset-a-c-sharp-net-textreader-cursor-back-to-the-start-point) for what you could try – Attila Jun 14 '12 at 15:18
0

I think you simply have to do this (tidy it up as needed; I made some simplifications and have not compiled or tested it):

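    // edit
    private static TextReader PrepareReader(TextReader reader, out string outLine)
    {
        // An out parameter must be assigned on every path, including "no header found"
        outLine = null;

        string line;
        string headerIdentifier = "&1,";
        while ((line = reader.ReadLine()) != null)
        {
            // StartsWith is safe for lines shorter than the identifier (Substring(0, 3) would throw)
            if (line.StartsWith(headerIdentifier, StringComparison.Ordinal))
            {
                // edit: hand the header line back, since the reader has already consumed it
                outLine = line;
                break;
            }
        }

        // The same reader, now positioned just after the header line
        return reader;
    }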

In other words, use the input reference directly and advance the reader to where you want it.

Remember to close the reader outside this method.
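For illustration, here is a self-contained sketch of how a caller might use this kind of single-pass skip. It assumes the header line starts with "&1," as in the question; the helper name `SkipToHeader`, the class name `SkipDemo`, and the sample data are hypothetical, and a `StringReader` stands in for the real file.

```csharp
using System;
using System.IO;

public static class SkipDemo
{
    // Hypothetical helper: advances the reader past the junk lines and
    // returns the header line, or null if no header line was found.
    public static string SkipToHeader(TextReader reader, string headerIdentifier)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.StartsWith(headerIdentifier, StringComparison.Ordinal))
            {
                return line;
            }
        }
        return null;
    }

    public static void Main()
    {
        // The caller owns the reader, so the caller disposes it.
        using (TextReader reader = new StringReader("junk\nmore junk\n&1,a,b\n1,2,3\n"))
        {
            string header = SkipToHeader(reader, "&1,");
            Console.WriteLine(header);            // &1,a,b
            Console.WriteLine(reader.ReadLine()); // 1,2,3
        }
    }
}
```

The header line is consumed while searching for it, which is why it is handed back to the caller rather than left in the reader; the next `ReadLine()` call then yields the first data row.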

Marcello Faga
  • But with this, won't I already have passed the first line that I will want to process? – Ste Jun 14 '12 at 15:17