48

I am exporting a 1200 X 800 matrix (indexMatrix) to a excel file using the standard Microsoft.Office.Interop.Excel. The app works, just that it is really really really slow( even for the 100 x 100 matrix) . I also export in a text file through a TextWriter an it works almost instantly . Is there any way to export to the excel file faster?

Here is my code :

        Excel.Application xlApp=new Excel.Application();
        Excel.Workbook xlWorkBook;
        Excel.Worksheet xlWorkSheet;
        object misValue = System.Reflection.Missing.Value;

        //xlApp = new Excel.ApplicationClass();
        xlWorkBook = xlApp.Workbooks.Add(misValue);

        xlWorkSheet = (Excel.Worksheet)xlWorkBook.Worksheets.get_Item(1);
        for (int i = 0; i < 800; i++)   //h
            for (int j = 0; j < 1200; j++)
                xlWorkSheet.Cells[i+1,j+1] =indexMatrix[i][j];


        xlWorkBook.SaveAs("C:\\a.xls", Excel.XlFileFormat.xlWorkbookNormal, misValue, misValue, misValue, misValue, Excel.XlSaveAsAccessMode.xlExclusive, misValue, misValue, misValue, misValue, misValue);
        xlWorkBook.Close(true, misValue, misValue);
        xlApp.Quit();

        releaseObject(xlWorkSheet);
        releaseObject(xlWorkBook);
        releaseObject(xlApp);

        MessageBox.Show("Excel file created , you can find the file c:\\csharp-Excel.xls");
Paul Sasik
  • 79,492
  • 20
  • 149
  • 189
Alex
  • 10,869
  • 28
  • 93
  • 165

8 Answers8

77

You are updating individual cells. That's going to be very slow. If you think about it, each time you update a cell, an RPC call will be marshalled to the Excel process.

It will be much faster if you assign your two dimensional array of values to an Excel Range of the same dimensions in a single statement (one cross-process call) instead of your current 1200 x 800 = 960,000 cross-process calls.

Something like:

// Get dimensions of the 2-d array
int rowCount = indexMatrix.GetLength(0);
int columnCount = indexMatrix.GetLength(1);
// Get an Excel Range of the same dimensions
Excel.Range range = (Excel.Range) xlWorkSheet.Cells[1,1];
range = range.get_Resize(rowCount, columnCount);
// Assign the 2-d array to the Excel Range
range.set_Value(Excel.XlRangeValueDataType.xlRangeValueDefault, indexMatrix);

Actually, to be pedantic, there are three cross-process calls in the above code (.Cells, .get_Resize and .set_Value), and there are two calls per iteration in your code (.Cells get and an implicit .set_Value) for a total of 1200 x 800 x 2 = 1,920,000.

Note range.get_Resize and range.set_Value were needed for an old version of the Excel interop library I was using when this post was first authored. These days you can use range.Resize and range.Value as noted in the comment by @The1nk.

Joe
  • 122,218
  • 32
  • 205
  • 338
  • there seems to be a problem at .get_Resize . It seems to not exist. – Alex Oct 21 '10 at 16:13
  • at excel.range line , the following error appears ''System.__ComObject' does not contain a definition for 'get_Resize'' – Alex Oct 21 '10 at 16:17
  • I think the edit will fix it. Cells[1,1] does not return a Range object - so it needs to be cast. – Joe Oct 21 '10 at 16:28
  • sorry to bother you again but i got another error : at range.set_value line appeared the error : " SafeArrayTypeMismatchException : Specified array was not of the expected type." – Alex Oct 21 '10 at 16:36
  • 1
    Your code doesn't show what type it is. It should be an object array, i.e. object[,] – Joe Oct 21 '10 at 17:05
  • 1
    You'll need to copy the data from your jagged array (int[][]) to a rectangular array: object[,] or maybe int[,] would work too. – Joe Oct 21 '10 at 17:43
  • 5
    Beautiful solution, worked tremendously for my 44,279 cells. From 5 minutes to 3 seconds. One note for future visitors - You don't need to use `get_Resize` or `set_Value`. You just use `cell.Resize` and `cell.Value` as normal. – The1nk Aug 31 '16 at 15:04
  • 5
    "If you think about it, each time you update a cell, an RPC call will be marshalled to the Excel process." That will probably still happen even if you don't think about it. – B. Clay Shannon-B. Crow Raven Nov 04 '16 at 21:30
  • @The1nk I got similar results. It's amazing how much a simple change can affect performance! –  Nov 28 '16 at 14:22
  • 2
    Thanks, but gives `HRESULT: 0x800A03EC` exception. – Artfaith Aug 28 '20 at 19:24
  • How would I do this if I'm working with objects instead of an array? Or even better, a List<> of objects? My objects have multiple property types (string, double, DateTime). – Smitty-Werben-Jager-Manjenson Jul 15 '21 at 20:42
  • @Smitty-Werben-Jager-Manjenson did you get any solution about how to use this with List<> of objects? – Oxygen Jun 05 '22 at 07:23
  • @Oxygen, I don't have access to my old code anymore but if I recall, I converted my lists to arrays. – Smitty-Werben-Jager-Manjenson Jun 06 '22 at 15:38
16

Excel interop is never going to be fast. You're basically remote-controlling an instance of the Excel application. You might have more success by creating a CSV file and then using Excel interop to convert this to a .xls or .xlsx file

Yuliy
  • 17,381
  • 6
  • 41
  • 47
  • 1
    +1 Nice approach, and this ensures to be faster as well, as a CSV file is more likely a text file, so that `TextWriter` class can handle it. – Will Marcouiller Oct 21 '10 at 15:42
  • Yup, that's my preferred method as well. Although you run into problems when cell values contain newlines, as that screws up your row index. Anyone have any tips for this issue? – Yevgeniy Jun 14 '13 at 16:57
  • 1
    In reply to my above comment: use import-csv (powershell) or an equivalent library function that will handle the messy situations with qualifier chars and newlines used in field values. – Yevgeniy Jun 14 '13 at 18:19
13

I had similar problems when reading an extremely large excel file and it took over 2 hours using interop.

I tried using ClosedXml and the process took less than 10 seconds. ClosedXml

// To loop 
Sheet.Row(y).Cell(x).Value

Also keep in mind interop will not work on your server unless you have excel installed. ClosedXml does not need excel installed.

MIKE
  • 1,039
  • 7
  • 24
  • First time to use ClosedXML and it's REALLY fast. I used InsertData with an array of arrays and an argument. Very smooth compared to HSSF and HSSF craziness of POI and the COM Exceptions of interop!!! +1 – Lzh Jul 19 '15 at 07:08
  • 1
    ClosedXML is faster because it uses OpenXML, which means it is writing the file directly. There are no intermediate APIs, no UI to update, etc. If it is an option for your use case(s), it is probably always going to be the fastest option, and you could do the same with just normal OpenXML. – Chris Jul 06 '18 at 15:47
  • My 2 cents. Using Interop: 39 seconds, using ClosedXml: 0.7 seconds. – Mario Nov 13 '20 at 12:24
7

Turn off ScreenUpdating before writing any data, Application.ScreenUpdating = FALSE then turn on at end of code = TRUE

Fjodr
  • 919
  • 13
  • 32
Paul Roberts
  • 81
  • 1
  • 1
2

ClosedXML is a miracle, it's a great deal faster and easier to use.

var workbook = new XLWorkbook();//create the new book

var worksheet = workbook.Worksheets.Add("Computer Install");// Add a sheet
worksheet.Cell(1,1).Value = "PC Name";// (Row, column) write to a cell

workbook.SaveAs(@"LIC documents.xlsx");// Save the book

You install using a nu Get package. https://www.nuget.org/packages/ClosedXML

Massimiliano Kraus
  • 3,638
  • 5
  • 27
  • 47
Chong Ching
  • 425
  • 3
  • 7
  • Yup, it is impressive. Went from about 4.5 hours to 5 minutes. Github: https://github.com/closedxml/closedxml – Andres May 31 '17 at 13:08
  • 1
    ClosedXML is faster because it uses OpenXML, which means it is writing the file directly. There are no intermediate APIs, no UI to update, etc. If it is an option for your use case(s), it is probably always going to be the fastest option, and you could do the same with just normal OpenXML. – Chris Jul 06 '18 at 15:48
2

Use Value2 to make it fast; Show excel before filling data

Siarhei Kuchuk
  • 5,296
  • 1
  • 28
  • 31
1

I am answering a little bit late sorry. I was working on my project and we had to use interop excel. And our data was too big, which was taking more than 1 minute with interop excel. We tried something else which is copy the all content of datagridview to clipboard, open an excel worksheet using interop excel, and paste the content to excel. It takes less than 1 second and exports our data perfectly.

DataGridView to string:

    var newline = System.Environment.NewLine;
    var tab = "\t";
    var clipboard_string = "";

    foreach (DataGridViewRow row in dgProjeler.Rows)
    {
        for (int i = 0; i < row.Cells.Count; i++)
        {
            if (i == (row.Cells.Count - 1))
                clipboard_string += row.Cells[i].Value + newline;
            else
                clipboard_string += row.Cells[i].Value + tab;
        }
    }

and copy the string to clipboard

        Clipboard.SetText(clipboard_string);

And open a worksheet, paste the content.

        Excel.Application app = new Excel.Application();
        app.Visible = true;
        Excel.Workbook wb = app.Workbooks.Add(1);
        Excel.Worksheet ws = (Excel.Worksheet)wb.Worksheets[1];
        // changing the name of active sheet
        ws.Name = "Exported from gridview";

        ws.Rows.HorizontalAlignment = HorizontalAlignment.Center;
        app.ActiveWindow.Activate();


        ws.Activate();
        ws.Paste();

        ws.Cells.EntireColumn.AutoFit();

It works perfectly on me, I hope it helps people who still couldn't find the solution.

  • I was able to forgo the app.Visible, app.ActiveWindow.Activate(), and ws.Activate() and it still worked perfectly on my tab- and newline-delimited string. Thanks so much! – Eric Eggers Nov 02 '22 at 18:56
0

There are three ways to do this, 2 of which are mentioned in different answers by others:

  1. Directly set the value of a range in excel to the 2D array.
  2. Write data to a CSV file, then use interop to save the CSV file as an xls or xlsx file.
  3. Write data to a CSV file, then use the data connections feature to use the CSV as a data source and import the data.

All the three methods above are very fast. I could write data with a size of 90000 rows and 100 columns in around 6 seconds.

P.S. They didn't however solve my problem with formatting the data for borders, font styles, colors, cell merging etc.

Sanjit
  • 3
  • 4