1

I've been working on a small program for a friend of mine who has a very large file which I read into the a datagridview > Modify data > export to csv. I managed to make everything work relevatively well until recently when he asked me to make some changes to the way the data is exported. For some reason, I am getting an Out of Memory exception when running this function.

private void ExportData(int fileNum = 1, int rowCount = 0)
    {
        int lastRow = rowCount;
        if (!Directory.Exists(ExportPath + dataFilePath.Name))
            Directory.CreateDirectory(ExportPath + dataFilePath.Name);
        StreamWriter sw = new StreamWriter(ExportPath + dataFilePath.Name + @"\" + dataFilePath.Name + "_" + fileNum + ".csv");

        //var headers = dataGridView1.Columns.Cast<DataGridViewColumn>();
        //sw.WriteLine(string.Join(",", headers.Select(column => "\"" + column.HeaderText + "\"").ToArray()));
        sw.WriteLine("Unit,UPC,Brand,Vendor,List Cost,QTY,Price,Description,Attribute 1,Attribute 2," +
            "Descriptor 1,Descriptor 2,Descriptor 3,Descriptor 4,Descriptor 5,Descriptor 6,Descriptor 7,Descriptor 8");

        for (int i = 0; i < 50000; i++)
        {
            rowCount = lastRow + i;

            if (rowCount >= dataGridView1.RowCount)
                break;
            var cells = dataGridView1.Rows[rowCount].Cells.Cast<DataGridViewCell>();
            sw.WriteLine(string.Join(",", cells.Select(cell => "\"" + cell.Value + "\"").ToArray()));
        }

        sw.Close();
        sw.Dispose();
        lastRow = rowCount + 1;
        if (lastRow < dataGridView1.RowCount - 1)
            ExportData(fileNum + 1, lastRow);
        else
        {
            progressBar1.BeginInvoke(new MethodInvoker(delegate {
                progressBar1.Style = ProgressBarStyle.Blocks;
                button_OpenDataFile.Enabled = true;
                button_ConvertFromRaw.Enabled = true;
                button_exportLS.Enabled = true;

                Console.WriteLine("[Main] Export complete.");
            }));
        }
    }

var cells = dataGridView1.Rows[rowCount].Cells.Cast<DataGridViewCell>(); seems to be the line the error occurs on.

Could anyone provide any insight into what Im doing wrong?

Thank you!

  • i guess its going to infinite loop causing memory over flow. – user9405863 Mar 13 '18 at 03:45
  • 6
    Don't use recursion when working with large data sets. – Ron Beyer Mar 13 '18 at 03:45
  • It is not an infinite loop. That was my next thought, is it something to do with recursion. Ill switch it up and see what happens. Do you have any other suggestions? Also thank you :) – Chase Gober Mar 13 '18 at 03:48
  • How large is the csv file when the exception occurs? What does the progress bar do? It seems to change after the entire export is completed and does nothing when the csv is being written. Does code fail during the write of the csv or when the progress bar is being updated? – jdweng Mar 13 '18 at 03:57
  • It looks like you are exporting the same csv data over and over again by calling ExportData. Increasingg filenub by one doesn't change anything except the filename. You also are creating a lot of directories which may be the reason for the out of memory. – jdweng Mar 13 '18 at 04:06
  • It does not seem to matter how large or small the csv file is. I can set the row limit to 50k, which i often do for testing, or 500k which is what the client wants at production time. The line in particular on which the error is happening is here `var cells = dataGridView1.Rows[rowCount].Cells.Cast();` – Chase Gober Mar 13 '18 at 04:07
  • @jdweng the code works as expected. it creates csv files like so (AN_1,AN_2,AN_3...) it was only after I made changes to the column header structure that I get the oom error. It was using significant memory before this as well, but working as intended. I think now that I have even more data in my export, it crashes. – Chase Gober Mar 13 '18 at 04:10
  • What if you use `ToString()` instead of `.Cast` like this: https://stackoverflow.com/questions/9943787/exporting-datagridview-to-csv-file. What's the actual purpose of this? No human can edit 500,000 lines in a datagridview! – Nick.Mc Mar 13 '18 at 04:20
  • I don't care what the code did before the change. It is wrong and need to be fixed. I can write code bad code that can take a day to run and get proper results, when good code will only take 10 seconds to get the correct results. Which would you run?. The code that takes a day to run or the code that takes 10 seconds to run. – jdweng Mar 13 '18 at 04:21
  • Jdweng but there aren't excess directories being created nor is the same data being written to the files? Lastrow keeps track of which row to write. I agree with you, if it can be improved I'm just asking for suggestions! That's why I'm here:) – Chase Gober Mar 13 '18 at 04:28
  • Nick, they aren't actually editing the data in the gridview. Just preview before export. – Chase Gober Mar 13 '18 at 04:29
  • You can remove the `.ToArray()` which is redundant and will burn through some memory. The StreamWriter should be in a using block - that won't be part of the problem; but you should do it. – Richardissimo Mar 13 '18 at 06:46
  • again.... what human can preview and make sense of 50,000 lines of data? – Nick.Mc Mar 13 '18 at 09:44
  • Nick, they really only need to preview a very small sample. The datagridview is bound to an object called HTDataObject. Should I loop through it instead of the rows of the datagridview? – Chase Gober Mar 13 '18 at 13:24

2 Answers2

0

Do this experiment: convert your code to a loop, instead of using recursion:

private void ExportData()
{
    //You only need to do this once, take it out of the loop.
    if (!Directory.Exists(ExportPath + dataFilePath.Name))
        Directory.CreateDirectory(ExportPath + dataFilePath.Name);

    var fileNum = 0;
    var rowCount = 0;

    while (rowCount < dataGridView1.RowCount)
    {
        fileNum = fileNum + 1;

        using (StreamWriter sw = new StreamWriter(ExportPath + dataFilePath.Name + @"\" + dataFilePath.Name + "_" + fileNum + ".csv")
        {
            sw.WriteLine("Unit,UPC,Brand,Vendor,List Cost,QTY,Price,Description,Attribute 1,Attribute 2," +
                "Descriptor 1,Descriptor 2,Descriptor 3,Descriptor 4,Descriptor 5,Descriptor 6,Descriptor 7,Descriptor 8");

            for (int i = 0; i < 50000; i++)
            {
                rowCount = rowCount + 1;

                if (rowCount >= dataGridView1.RowCount)
                    break;
                var cells = dataGridView1.Rows[rowCount].Cells.Cast<DataGridViewCell>();
                sw.WriteLine(string.Join(",", cells.Select(cell => "\"" + cell.Value + "\"").ToArray()));
            }
        }   //sw.Close() and sw.Dispose() not needed because of the 'using'. You may want to do sw.Flush().
    }

    //The 'else' part of your original recursive method
    progressBar1.BeginInvoke(new MethodInvoker(delegate {
        progressBar1.Style = ProgressBarStyle.Blocks;
        button_OpenDataFile.Enabled = true;
        button_ConvertFromRaw.Enabled = true;
        button_exportLS.Enabled = true;

        Console.WriteLine("[Main] Export complete.");
    }));
}

Does the error go away? Probably yes. Recursion uses a lot of memory in a stack and does not release it until the end of the recursion, when going up the stack again. The line where you get the error just happens to try and add to memory the contents of a whole line in your csv file. That may be the last drop that causes the out of memory exception if the memory is already almost full with the recursion stack.

I removed some accumulator vars that seemed redundant, I hope I didn't mess up with the ranges of the loops.

I removed the CreateDirectory from the loop, and added a using statement for the StreamWriter. I don't think those were the reasons for your error, as the directory was created only once, and you were disposing the StreamWriter before the recursive call, but anyway if you want to confirm it you can try undoing those changes in the non-recursive code one by one, and see if the error happens again.

Diana
  • 2,186
  • 1
  • 20
  • 31
  • Thank you for helping me! I implemented it as you described, and it does make sense what you are saying. For some reason, the memory is not being released during ExportData. I can watch it climb all the way to near 4gb until finally, it crashes. :\ – Chase Gober Mar 14 '18 at 00:10
  • Glad it helped! – Diana Mar 14 '18 at 15:02
0

So I figured it out.
I guess iterating through the datagridview just isn't the way to go.
Instead, I just exported the data using my data source. It goes a hell of a lot faster... down from 2 minutes to about 2 seconds.

Thank you, everyone, for the help!

private void ExportData()
    {
        //You only need to do this once, take it out of the loop.
        if (!Directory.Exists(ExportPath + dataFilePath.Name))
            Directory.CreateDirectory(ExportPath + dataFilePath.Name);

        var fileNum = 0;
        var rowCount = 0;

        while (rowCount < dataGridView1.RowCount)
        {
            fileNum = fileNum + 1;

            using (StreamWriter sw = new StreamWriter(ExportPath + dataFilePath.Name + @"\" + dataFilePath.Name + "_" + fileNum + ".csv"))
            {
                sw.WriteLine("Unit,UPC,Brand,Vendor,List Cost,QTY,Price,Description,Attribute 1,Attribute 2" +
                "Descriptor 1,Descriptor 2,Descriptor 3,Descriptor 4,Descriptor 5,Descriptor 6,Descriptor 7,Descriptor 8");

                for (int i = 0; i < 50000; i++)
                {
                    rowCount = rowCount + 1;

                    if (rowCount >= dataGridView1.RowCount)
                        break;
                    var s = new string[]
                    {
                        "\"" + DATA[rowCount].Unit + "\"",
                        "\"" + DATA[rowCount].UPC + "\"",
                        "\"" + DATA[rowCount].Brand + "\"",
                        "\"" + DATA[rowCount].Vendor + "\"",
                        "\"" + DATA[rowCount].List_Cost + "\"",
                        "\"" + DATA[rowCount].Quantity.ToString() + "\"",
                        "\"" + DATA[rowCount].Price + "\"",
                        "\"" + DATA[rowCount].Description + "\"",
                        "\"" + DATA[rowCount].Attribute_1 + "\"",
                        "\"" + DATA[rowCount].Attribute_2 + "\"",
                        "\"" + DATA[rowCount].Descriptor_1 + "\"",
                        "\"" + DATA[rowCount].Descriptor_2 + "\"",
                        "\"" + DATA[rowCount].Descriptor_3 + "\"",
                        "\"" + DATA[rowCount].Descriptor_4 + "\"",
                        "\"" + DATA[rowCount].Descriptor_5 + "\"",
                        "\"" + DATA[rowCount].Descriptor_6 + "\"",
                        "\"" + DATA[rowCount].Descriptor_7 + "\"",
                        "\"" + DATA[rowCount].Descriptor_8 + "\""
                    };
                    sw.WriteLine(string.Join(",", s));
                    sw.Flush();
                }
            }   //sw.Close() and sw.Dispose() not needed because of the 'using'. You may want to do sw.Flush().
        }

        //The 'else' part of your original recursive method
        progressBar1.BeginInvoke(new MethodInvoker(delegate
        {
            progressBar1.Style = ProgressBarStyle.Blocks;
            button_OpenDataFile.Enabled = true;
            button_ConvertFromRaw.Enabled = true;
            button_exportLS.Enabled = true;

            Console.WriteLine("[Main] Export complete.");
        }));
    }'