
I have a CSV file that I need to read all values from, but only from the last row, which is the newest. I can't figure out how to do it.

I'm using a StreamReader now, but that loops through the whole CSV file, which can be very big.

I need to read two different CSV files (the headers are the same but in different locations).

How can I change this code to only read the last row? I have searched around but could not find anything that matches my case.

This is what I'm doing right now:

using (var reader = new StreamReader(path))
{
    var headerLine = reader.ReadLine();
    var headerValues = headerLine.Split(',');
    var serialNumber = headerValues[66];

    while (!reader.EndOfStream)
    {
        var line = reader.ReadLine();
        var values = line.Split(',');

        if (values[0] != "Index")
        {
            var analyseReport = new AnalyseReport();
            analyseReport.SerialNumber = serialNumber;

            foreach (var item in headerValues)
            {
                var headerName = item.Trim();
                var index = Array.IndexOf(headerValues, item);

                switch (headerName)
                {
                    case "Index":
                        analyseReport.Index = Convert.ToInt32(values[index].Trim());
                        break;

                    case "Time":
                        analyseReport.TimeStamp = Convert.ToDateTime(values[index].Trim());
                        break;

                    case "Reading No":
                        analyseReport.ReadingNo = Convert.ToInt32(values[index].Trim());
                        break;

                    case "Duration":
                        analyseReport.Duration = values[index].Trim();
                        break;

                    case "Type":
                        analyseReport.Type = values[index].Trim();
                        break;

                    case "Units":
                        analyseReport.Units = values[index].Trim();
                        break;

                    default:
                        break;
                }
            }

            analyseReportList.Add(analyseReport);
        }
    }
    return analyseReportList;
}

2 Answers


If your lines aren't fixed-length, but you can determine an upper limit on their length, you can use a heuristic to make reading the last line far more efficient.

Basically, you move the file stream's current position to n bytes before the end and then read until you've reached the last line.

private string ReadLastLine(string fileName, int maximumLineLength)
{
    string lastLine = string.Empty;

    using (Stream s = File.OpenRead(fileName))
    {
        // Don't seek before the start of files shorter than the limit.
        long offset = Math.Min(maximumLineLength, s.Length);
        s.Seek(-offset, SeekOrigin.End);

        using (StreamReader sr = new StreamReader(s))
        {
            string line;

            while ((line = sr.ReadLine()) != null)
            {
                lastLine = line;
            }
        }
    }

    return lastLine;
}

For a 1.8 MB CSV this is about 100-200 times faster than the File.ReadLines approach, albeit at the cost of considerably more complicated code.

Remarks: while you could use this method to optimize reading the last line, unless it really is your bottleneck, please use the much clearer and cleaner version from Tim Schmelter's answer if at all possible. Remember the second rule of optimization: "Don't do it, yet."

Furthermore, to determine the maximum length of your lines, you have to be careful and take the character encoding into account. Rather over- than underestimate the line length.
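One conservative way to pick `maximumLineLength` (a sketch, not part of the original answer; the 500-character limit is a made-up example) is to take the largest plausible character count per line and let the encoding report its worst-case byte count:

```csharp
using System;
using System.Text;

class MaxLineLengthEstimate
{
    static void Main()
    {
        // Hypothetical example: lines of up to 500 characters, UTF-8 encoded.
        int maxCharsPerLine = 500;

        // GetMaxByteCount returns the worst-case number of bytes needed for
        // that many characters, so the seek offset can never be too small.
        int maxBytesPerLine = Encoding.UTF8.GetMaxByteCount(maxCharsPerLine);

        Console.WriteLine(maxBytesPerLine);
    }
}
```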

Paul Kertscher

You could use File.ReadLines(path).Last():

string headerLine = File.ReadLines(path).First();
string lastLine = File.ReadLines(path).Last();
// ... (no loop)

First and Last are LINQ methods, so you need to add `using System.Linq;`.
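Applied to the question's code, only the header line and the last line then need splitting. A minimal sketch of the pairing step (the `ReadLastRow` name and the dictionary shape are made up here, not from the answer):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

class LastRowReader
{
    // Sketch: pair each header with the matching value of the last row,
    // using the question's plain comma split (no quoted-field handling).
    public static Dictionary<string, string> ReadLastRow(string path)
    {
        string[] headers = File.ReadLines(path).First().Split(',');
        string[] values = File.ReadLines(path).Last().Split(',');

        // Zip stops at the shorter array; duplicate header names
        // would make ToDictionary throw.
        return headers.Zip(values, (h, v) => new { h, v })
                      .ToDictionary(p => p.h.Trim(), p => p.v.Trim());
    }
}
```

The switch over header names in the question then becomes a handful of dictionary lookups on the single last row instead of a loop over every line.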

Tim Schmelter
  • But wouldn't this still read all lines, which is what the OP tried to avoid? – Paul Kertscher Feb 19 '18 at 07:43
  • @PaulKertscher: you can't avoid that, but he would not _process_ all lines (as he does currently) and his code would stay concise and readable. `Enumerable.First` only reads the first line; `Last` will "read" all lines but doesn't do anything with them, it only returns the last. – Tim Schmelter Feb 19 '18 at 07:43
  • Thanks, I will try this out. – AndreasPettersson Feb 19 '18 at 07:55
  • @AndreasPettersson: keep in mind that you should use `ReadLines`, not `ReadAllLines`. The latter does indeed read all lines into memory. – Tim Schmelter Feb 19 '18 at 07:59
  • "you can't avoid that" - why is that? You can read the stream backwards until you meet a newline character. – Evk Feb 19 '18 at 08:04
  • @Evk: you can try Jon Skeet's approach, which I haven't tested. I doubt it makes your code more readable: https://stackoverflow.com/a/452945/284240 I also doubt that this is what the OP really wanted. He just wanted to avoid processing all lines in the loop. – Tim Schmelter Feb 19 '18 at 08:09
  • Not sure why it matters whether it makes the code more readable or not. The OP wants to avoid reading the whole (potentially huge) CSV file from the hard drive for just one line (at least I think so), not to make the code readable. And your solution (which is of course better than the OP's current one) still reads the whole file from disk. – Evk Feb 19 '18 at 08:13
  • @Evk: I'm sure that the question arose because the OP had a performance issue reading **and** processing all lines. If you look at his code you'll see that he splits every line and probably also consumes them all (building an `AnalyseReport` for each). If you only do that with the last row, the code will be much more efficient. Problem solved. As a side effect his code will also become more readable. – Tim Schmelter Feb 19 '18 at 08:16
  • Well, maybe you are right. But if it so happens that the OP has 10 GB CSV files and this is still not fast enough, it's good to let him know there are even faster approaches. – Evk Feb 19 '18 at 08:19
  • I have tried Tim Schmelter's code and it seems to work well with 11,000 rows in the CSV file. – AndreasPettersson Feb 19 '18 at 11:08
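For completeness, the "read the stream backwards until a newline" idea from the comments could be sketched like this (an editor's sketch, not from either answer; it assumes UTF-8 or another encoding in which the byte 0x0A only ever means a line feed):

```csharp
using System;
using System.IO;
using System.Text;

class BackwardsLastLine
{
    // Walk backwards from the end of the file until the previous line break,
    // so only the last line's bytes are ever read from disk.
    public static string ReadLastLine(string path)
    {
        using (var s = File.OpenRead(path))
        {
            if (s.Length == 0) return string.Empty;

            long pos = s.Length - 1;

            // Skip any trailing newline characters at the end of the file.
            while (pos >= 0)
            {
                s.Seek(pos, SeekOrigin.Begin);
                int b = s.ReadByte();
                if (b != '\n' && b != '\r') break;
                pos--;
            }

            long end = pos;

            // Walk backwards until the previous line feed (or start of file).
            while (pos >= 0)
            {
                s.Seek(pos, SeekOrigin.Begin);
                if (s.ReadByte() == '\n') break;
                pos--;
            }

            long start = pos + 1;
            var buffer = new byte[end - start + 1];
            s.Seek(start, SeekOrigin.Begin);
            s.Read(buffer, 0, buffer.Length); // small buffer, one read suffices here
            return Encoding.UTF8.GetString(buffer);
        }
    }
}
```

Unlike the seek-by-estimate heuristic above, this needs no guess about the maximum line length, at the cost of byte-by-byte seeking and the single-byte-newline assumption.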