8

I have problem with C# Interop Excel get valid range with big file

https://www.dropbox.com/s/betci638b1faw8g/Demo%20Training%20Data.xlsx?dl=0

This is my file but real size is 1000 but I got 49998

I used standard code

Excel.Application xlApp = new Excel.Application();
Excel.Workbook xlWorkbook = xlApp.Workbooks.Open(txtbTrainPath.Text);
Excel.Worksheet xlWorksheet = xlWorkbook.Worksheets[1];
Excel.Range xlRange = xlWorksheet.UsedRange;              
xlApp.Visible = true;
xlWorksheet.Columns[5].Delete();
xlWorksheet.Columns[3].Delete();              
rowCount = xlRange.Rows.Count;
colCount = xlRange.Columns.Count;

Only for this file is not work correctly, other files works well. Please help me to find what is the problem. How to resize worksheet for only valid size.

ManishChristian
  • 3,759
  • 3
  • 22
  • 50
X zheng
  • 1,731
  • 1
  • 17
  • 25
  • It sounds like the used range is possibly returning cells that have either some formatting, or data that is not visible. Used range will still return cells that contain ANY formatting in the cell, even if the cell contains NO data. – JohnG Apr 11 '17 at 18:13

2 Answers2

21

I would use this approach to get the Rows and Columns count which will return the result of the cells which are not empty.

// Find the last real row
lastUsedRow = worksheet.Cells.Find("*",System.Reflection.Missing.Value, 
                               System.Reflection.Missing.Value, System.Reflection.Missing.Value, 
                               Excel.XlSearchOrder.xlByRows,Excel.XlSearchDirection.xlPrevious, 
                               false,System.Reflection.Missing.Value,System.Reflection.Missing.Value).Row;

// Find the last real column
lastUsedColumn = worksheet.Cells.Find("*", System.Reflection.Missing.Value, 
                               System.Reflection.Missing.Value,System.Reflection.Missing.Value, 
                               Excel.XlSearchOrder.xlByColumns,Excel.XlSearchDirection.xlPrevious, 
                               false,System.Reflection.Missing.Value,System.Reflection.Missing.Value).Column;

Here is the complete code for your reference:

using Excel = Microsoft.Office.Interop.Excel;

Excel.Application xlApp     = null;
Excel.Workbook wb           = null;
Excel.Worksheet worksheet   = null;
int lastUsedRow             = 0;
int lastUsedColumn          = 0;
string srcFile              = @"Path to your XLSX file";

xlApp = new Excel.ApplicationClass();
xlApp.Visible = false;
wb = xlApp.Workbooks.Open(srcFile,
                               0, false, 5, "", "", false, Excel.XlPlatform.xlWindows, "",
                               true, false, 0, true, false, false);

worksheet = (Excel.Worksheet)wb.Worksheets[1];
Excel.Range range

// Find the last real row
lastUsedRow = worksheet.Cells.Find("*",System.Reflection.Missing.Value, 
                               System.Reflection.Missing.Value, System.Reflection.Missing.Value, 
                               Excel.XlSearchOrder.xlByRows,Excel.XlSearchDirection.xlPrevious, 
                               false,System.Reflection.Missing.Value,System.Reflection.Missing.Value).Row;

// Find the last real column
lastUsedColumn = worksheet.Cells.Find("*", System.Reflection.Missing.Value, 
                               System.Reflection.Missing.Value,System.Reflection.Missing.Value, 
                               Excel.XlSearchOrder.xlByColumns,Excel.XlSearchDirection.xlPrevious, 
                               false,System.Reflection.Missing.Value,System.Reflection.Missing.Value).Column;

xlApp.Workbooks.Close();
xlApp.Quit();

Marshal.ReleaseComObject(worksheet);
Marshal.ReleaseComObject(wb);
Marshal.ReleaseComObject(xlApp);
ManishChristian
  • 3,759
  • 3
  • 22
  • 50
  • 1
    I have no idea why, but `lastUsedColumn` part threw an exception with a very specific sheet in my project. I had to fall back to `worksheet.UsedRange.Columns.Count`. However, after I used `worksheet.Cells.UnMerge()`, it started working. – Wiktor Stribiżew Feb 05 '21 at 16:47
1

After downloading the excel file and taking a look at it, I found as I expected, formatting on row 49998 column C. The font was different, the number formatting was different and the font color was red. I simply deleted all the cells on row 49998 and this simply decreased the number of used range rows to 49997. You could play this guess which cells have formatting by deleting the last row then check again. I have a feeling ALL cells above row 49998 have some kind of different formatting.

To fix this one sheet to get the proper used range. You have two choices. 1) Select the WHOLE row starting at row 1001, then scroll down to row 49998. With the SHIFT key pressed, click on the WHOLE row 49998. With rows 1001 to 49998 selected, right click on the selection and select "Delete" and possibly, if it asks, shift the cells up. Or 2) copy the cells with data only and paste the cells into an new worksheet, delete the old worksheet and rename the new worksheet to the previous worksheet name. I hope this makes sense.

JohnG
  • 9,259
  • 2
  • 20
  • 29
  • i tried to remove rows 1001 to 49998 but it cannot delete. how i have to delete rows? – X zheng Apr 12 '17 at 02:34
  • Right click on the selection and select "Delete"... or select "Delete" from the "Cells" ribbon of the "Home Tab" – JohnG Apr 12 '17 at 03:17
  • now i have tried this is programmily.. how i can select multirows in interop. and i am going to use Range.ClearFormats. clearly delete Range[1001:49998].delete or clearFormats. I am finding this... please help me – X zheng Apr 12 '17 at 03:52
  • Don't do it programmatically. Simply open the Excel file and delete the rows manually as it appears to only be this file that has the issue. If you want to delete all the empty rows in a worksheet programmatically, I answered this before however the answer is rather involved but will definitely get rid of all EMPTY rows/column regardless of formatting. [Remove Empty rows and Columns From Excel Files Faster using Interop](http://stackoverflow.com/questions/40574084/remove-empty-rows-and-columns-from-excel-files-faster-using-interop) – JohnG Apr 12 '17 at 03:59
  • thank you Mr. JohnG , Thank you very much for your help. :) – X zheng Apr 12 '17 at 04:06
  • Mr. John. Could you help me to select multi rows. now i used your method. it works correct but too much time running. more than 10 minutes how i can select multirows and clear at once. worksheet.Rows[rowIndex].Delete(); this for about 49899 rows has to much time.... – X zheng Apr 12 '17 at 04:25
  • You will have to show some code on how you are deleting these rows. Create a range from row 1001 to column 5 Row 49998, then, delete that range. If you were doing this programmatically, I would have to see the code to determine what could be taking so long. The code I linked to should have this deletion done in a matter of seconds. – JohnG Apr 12 '17 at 04:37
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/141492/discussion-between-riguang-zheng-and-johng). – X zheng Apr 12 '17 at 04:50