I have a text file which contains a sequence of price data. The problem applies to any long history of recorded data, such as temperature, air humidity, prices, log files, ...

The head of my history file looks like the following:
(screenshot of the file head; image not available)

If I want to read and process a file too large for memory, I would normally choose the following code:

using (FileStream fs = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (BufferedStream bs = new BufferedStream(fs))
using (StreamReader sr = new StreamReader(bs))
{
    string line;
    while ((line = sr.ReadLine()) != null)
    {
        // Process Data
    }
}

In my case, a record is created every 1000 ms. The most recent data is at the end of the file. The issue arises when trying to process the most recent data.

Example:
I want to generate an average of the last 30 days.
It would be most efficient to start at the end of the file and move towards the beginning until the X-day threshold is met. The sample code above would read through the whole file, which is barely usable in this scenario: I would hit the worst case every time I need to update indicators based on recent data. This issue of course applies to any operation where you want to process the last x elements.

Is there any functionality to read a file from its end to its start?

julian bechtold

2 Answers

Try the following code. The last line of the file could be blank; I wasn't sure of the best way to handle that case.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace GetFileReverse
{
    class Program
    {
        const string FILENAME = @"c:\temp\test.txt";
        static void Main(string[] args)
        {
            // The using block disposes the reader on exit, which releases the file handle.
            using (GetFileReverse getFileReverse = new GetFileReverse(FILENAME))
            {
                string line;
                while ((line = getFileReverse.ReadLine()) != null)
                {
                    Console.WriteLine(line);
                }
            }
        }
    }
    public class GetFileReverse : IDisposable
    {
        const int BUFFER_SIZE = 1024;
        private FileStream stream { get; set; }
        private string data { get; set; }
        public Boolean SOF { get; set; }
        private long position { get; set; }
        public GetFileReverse(string filename)
        {
            // File.OpenRead throws on failure instead of returning null,
            // so no null check is needed here.
            stream = File.OpenRead(filename);
            position = stream.Seek(0, SeekOrigin.End);
            SOF = false;
            data = string.Empty;
        }
        private byte[] ReadStream()
        {
            byte[] bytes = null;
            int size = BUFFER_SIZE;
            if (position != 0)
            {
                bytes = new byte[BUFFER_SIZE];
                long oldPosition = position;
                if (position >= BUFFER_SIZE)
                {
                    position = stream.Seek(-1 * BUFFER_SIZE, SeekOrigin.Current);
                }
                else
                {
                    position = stream.Seek(-1 * position, SeekOrigin.Current);
                    size = (int)(oldPosition - position);
                    bytes = new byte[size];
                }
                stream.Read(bytes, 0, size);
                stream.Seek(-1 * size, SeekOrigin.Current);
            }
            return bytes;

        }
        public string ReadLine()
        {
            string line = "";
            while (!SOF && (!data.Contains("\r\n")))
            {
                byte[] bytes = ReadStream();
                if (bytes != null)
                {
                    string temp = Encoding.UTF8.GetString(bytes);
                    data = data.Insert(0, temp);
                }
                SOF = position == 0;
            }


            int lastReturn = data.LastIndexOf("\r\n");
            if (lastReturn == -1)
            {
                if (data.Length > 0)
                {
                    line = data;
                    data = string.Empty;
                }
                else
                {
                    line = null;
                }
            }
            else
            {
                line = data.Substring(lastReturn + 2);
                data = data.Remove(lastReturn);
            }

            return line;
        }
        public void Close()
        {
            stream.Close();
        }
        public void Dispose()
        {
            stream.Dispose();
            data = string.Empty;
            position = -1;
        }
    }
}
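
The reader above decodes each 1024-byte block on its own and splits only on "\r\n", so a multi-byte UTF-8 character that straddles a block boundary, or a file with bare "\n" line endings, can come out wrong. The following is a minimal sketch of a variant that works on raw bytes and decodes one whole line at a time. The names `ReverseLineReader`, `ReadLinesReverse` and `blockSize` are made up for this illustration, and the sketch assumes the file is UTF-8 without a byte order mark.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; assumes UTF-8 text without a BOM.
public static class ReverseLineReader
{
    // Yields the lines of the file from last to first.
    // Lines are split on the '\n' byte; a trailing '\r' is trimmed.
    public static IEnumerable<string> ReadLinesReverse(string path, int blockSize = 4096)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long position = fs.Length;
            var pending = new List<byte>(); // bytes of the current line, in reverse order
            var block = new byte[blockSize];
            bool first = true;              // true until the very last byte of the file is seen

            while (position > 0)
            {
                int size = (int)Math.Min(blockSize, position);
                position -= size;
                fs.Seek(position, SeekOrigin.Begin);

                // Read may return fewer bytes than requested, so loop until the block is full.
                int read = 0;
                while (read < size)
                {
                    int n = fs.Read(block, read, size - read);
                    if (n == 0) throw new EndOfStreamException();
                    read += n;
                }

                // Walk the block backwards; 0x0A never occurs inside a multi-byte
                // UTF-8 sequence, so splitting on it is safe at the byte level.
                for (int i = size - 1; i >= 0; i--)
                {
                    if (block[i] == (byte)'\n')
                    {
                        // A newline that terminates the file does not start an extra empty line.
                        if (!first) yield return Decode(pending);
                        pending.Clear();
                    }
                    else
                    {
                        pending.Add(block[i]);
                    }
                    first = false;
                }
            }

            // Whatever is left is the first line of the file.
            if (!first) yield return Decode(pending);
        }
    }

    private static string Decode(List<byte> reversed)
    {
        byte[] bytes = reversed.ToArray();
        Array.Reverse(bytes);
        int length = bytes.Length;
        if (length > 0 && bytes[length - 1] == (byte)'\r') length--;
        return Encoding.UTF8.GetString(bytes, 0, length);
    }
}
```

Because the method is an iterator, a caller can loop with `foreach` over `ReverseLineReader.ReadLinesReverse(path)` and `break` as soon as a record is older than the threshold; leaving the loop disposes the enumerator, which closes the file.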
jdweng
  • Thank you for that. It works with the addition of `if(line == "") continue;`. Unfortunately, even though I call `getFileReverse.Close()` and `getFileReverse.Dispose()`, the file seems to get locked – julian bechtold Dec 27 '19 at 19:48
  • Are you using my latest code? Made some improvements after my initial posting. If code hangs then use VS menu Debug : BreakAll to find where it hangs. The code may not work if your file doesn't have both "\r\n", or the encoding is unicode. I did a lot of debugging. I may not be handling a boundary condition properly like greater than instead of greater than and equal (or less than). – jdweng Dec 27 '19 at 21:44
  • Instead of calling Dispose directly, wrapping the object in a using statement may do a better job of disposing: `using (GetFileReverse getFileReverse = new GetFileReverse(FILENAME)) { ... }` – jdweng Dec 27 '19 at 21:48
  • Hey, thank you for your code. It's great! Unfortunately my question was marked as answered even though the other thread does not provide a working answer to my question. – julian bechtold Jan 03 '20 at 10:23

You can use Seek to jump to a position relative to the end of the file. However, you will need to "guess" or calculate how far from the end to go. For example, to start reading at the last 1024 bytes:

    stream.Seek(-1024, SeekOrigin.End);

Just work out the maximum number of bytes the last 30 rows could occupy, seek to that far before the end of the file, and then read only that portion of the file.
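
A minimal sketch of that idea follows. The names `TailReader`, `ReadTailLines` and `maxBytes` are hypothetical, and the sketch assumes UTF-8 text. The seek usually lands in the middle of a line, so the first line read after seeking is thrown away. As a worked example of the offset calculation for the question's case: one record per second over 30 days is 30 × 86,400 = 2,592,000 records, and if each line were at most 40 bytes (an assumed figure, not taken from the question), the offset would be 2,592,000 × 40 = 103,680,000 bytes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; assumes UTF-8 text.
public static class TailReader
{
    // Returns the lines that start within the last maxBytes bytes of the file,
    // in their original order (oldest first).
    public static List<string> ReadTailLines(string path, long maxBytes)
    {
        var lines = new List<string>();
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long start = Math.Max(0, fs.Length - maxBytes);

            // Seek one byte before the window. If that byte is a line break, the
            // first ReadLine returns an empty string; otherwise it returns the rest
            // of a partial line. Either way the first result is discarded.
            fs.Seek(start > 0 ? start - 1 : 0, SeekOrigin.Begin);

            using (var sr = new StreamReader(fs, Encoding.UTF8))
            {
                if (start > 0)
                {
                    sr.ReadLine();
                }

                string line;
                while ((line = sr.ReadLine()) != null)
                {
                    lines.Add(line);
                }
            }
        }
        return lines;
    }
}
```

If the estimate turns out too small (the oldest line returned is still newer than the threshold), the caller can retry with a larger `maxBytes`, for example by doubling it.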

Milney
  • If I understood you correctly, I do `FileStream fs = ...`, then `fs.Seek(x)`, and then `sr.ReadLine()` as usual? – julian bechtold Dec 27 '19 at 11:32
  • Yes indeed, but as I mentioned you need to calculate the correct value to pass to Seek() to get to the place you want to start reading from – Milney Dec 27 '19 at 11:33