
I'm developing in .NET 3.5 and I have an issue that, if we resolve it, could help other people with the same problem.

I'm running out of memory. I need to read a big text file of around 500 MB and 13 million lines; each line must be split on ; to get its values (around 7 values per line), and all the values must be loaded into a DataTable.

I don't know how to read and load it without using up all the memory in the system.

My PC has 8 GB of RAM and it fills up.

Thanks

DataTable dt = new DataTable();

dt.Columns.Add("Months");
dt.Columns.Add("WHS");
dt.Columns.Add("BRICK");
dt.Columns.Add("DAY");
dt.Columns.Add("SALES TYPE");
dt.Columns.Add("FCC");
dt.Columns.Add("UNITS");

String line;

using (FileStream fs = File.Open("path", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (BufferedStream bs = new BufferedStream(fs))
using (StreamReader sr = new StreamReader(bs))
{
    // Read the file line by line instead of loading it all at once
    while ((line = sr.ReadLine()) != null)
    {
        string[] parts = line.Split(';');
        dt.Rows.Add(parts[0], parts[1], parts[2], parts[3], parts[4], parts[5], parts[6]);
    }
    dataGridView1.DataSource = dt;
}

TEMPORARY SOLUTION

You need to clear the DataTable using a counter (I do all of this in a BackgroundWorker), for example:

int counter = 0;
while ((line = sr.ReadLine()) != null)
{
    string[] parts = line.Split(';');
    dt.Rows.Add(parts[0], parts[1], parts[2], parts[3], parts[4], parts[5], parts[6]);
    counter++;
    dataGridView1.DataSource = dt;

    // Every 500,000 rows, throw away what has been loaded so far so the
    // DataTable never holds the whole file at once
    if (counter >= 500000)
    {
        dt.Clear();
        counter = 0;
    }
}
AbelOrtiz
  • Data tables have a LOT of memory overhead ([some tests indicate over 4X overhead](http://stackoverflow.com/questions/424598)) - you might be better off creating a list of a custom class rather than filling a DataTable. – D Stanley Apr 14 '16 at 13:20 (a sketch of this approach follows these comments)
  • try this: `dt.BeginLoadData(); foreach (var line in File.ReadLines("")) dt.Rows.Add(line.Split(new[] {';'})); dt.EndLoadData();` – Hamid Pourjam Apr 14 '16 at 13:22
  • perhaps adding the machine settings which you are using can also shed some light – Veverke Apr 14 '16 at 13:57
  • Also, a string will use `20+(n/2)*4` bytes. At 13 million lines with 7 values per line, that could very well add up to another gigabyte of overhead. – Adwaenyth Apr 14 '16 at 13:58
  • By the way, is there any reason to read the file using the `FileStream` object family instead of simply `File.ReadAllLines()`? You are treating your content as text anyway. Although this is nearly insignificant, you would spare yourself the creation and management of the `FileStream`, `BufferedStream` and `StreamReader` objects. – Veverke Apr 14 '16 at 14:00
  • From another point of view, isn't there a flaw in the UI? I mean, loading 13 million entries in a data grid. Shouldn't you implement some kind of search functionality to narrow down the results? – PhilDulac Apr 14 '16 at 14:01
  • I would suggest using buffering while reading, with a buffer size of around 4K or 8K, and then processing the data as you go. – Ravi Tiwari Apr 14 '16 at 14:01
  • Well, Ravi just gave one good reason not to use `File.ReadAllLines` - in case one wants to buffer the reading. `File.ReadAllLines` will load the whole thing, I guess. – Veverke Apr 14 '16 at 14:03
  • @PhilDulac has a point. Any DataGridView that hosts 13 million rows with 7 columns will be horribly slow anyway, even if your RAM can manage it... what exactly are you trying to achieve with that amount of data in a UI? – Adwaenyth Apr 14 '16 at 14:14
  • Thanks all for the answers. When I try this method, the memory of my PC increases from 2 GB to the maximum and the PC crashes. I need to read the file with minimal RAM usage and I don't know how. I need to manage all the data from the file in a DataGridView to compare data. It is my first time working with big files and with C#. – AbelOrtiz Apr 18 '16 at 11:56
  • If you want to load all the data at once, I suggest loading it into a list and setting that as the DataSource of the DataGridView: `foreach(line...) myList.Add(new {Months=splittedLine[0], ...});` – mehrdad safa Apr 19 '16 at 08:37
  • I think the DataGridView can handle the data better with this approach. – mehrdad safa Apr 19 '16 at 08:38
  • So... you solved the problem by throwing data away. If you're going to throw them away anyway, why load them and keep them in memory in the first place? – Luaan Apr 19 '16 at 10:20
  • @Luaan true, I will try it. Thanks for all the answers – AbelOrtiz Apr 19 '16 at 13:16
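
Below is a minimal sketch of the approach suggested in the comments above: a plain class per row instead of a DataTable, built while streaming the file. The class name SalesRow and its property names are my assumptions based on the column names in the question, and "path" is a placeholder. Note that File.ReadLines, mentioned in one comment, was only added in .NET 4, so this sketch keeps the StreamReader loop from the question, which works on .NET 3.5.

using System.Collections.Generic;
using System.IO;

// Hypothetical class with one property per ;-separated value
public class SalesRow
{
    public string Months { get; set; }
    public string WHS { get; set; }
    public string BRICK { get; set; }
    public string DAY { get; set; }
    public string SalesType { get; set; }
    public string FCC { get; set; }
    public string UNITS { get; set; }
}

// Inside the load routine:
List<SalesRow> rows = new List<SalesRow>();
string line;

using (StreamReader sr = new StreamReader("path"))
{
    // Read line by line so the whole 500 MB file is never held as one string
    while ((line = sr.ReadLine()) != null)
    {
        string[] parts = line.Split(';');
        rows.Add(new SalesRow
        {
            Months = parts[0],
            WHS = parts[1],
            BRICK = parts[2],
            DAY = parts[3],
            SalesType = parts[4],
            FCC = parts[5],
            UNITS = parts[6]
        });
    }
}

// Bind the list instead of a DataTable (assign on the UI thread)
dataGridView1.DataSource = rows;

If a DataTable is still required, wrapping the loading loop in `dt.BeginLoadData()` / `dt.EndLoadData()`, as suggested in the comments, turns off notifications and index maintenance while rows are added, which also reduces the loading cost.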
