1

Fastest way to interface between live (unsaved) Excel data and C# objects features a very knowledgeable discussion about how to access Excel data efficiently. According to the answers two recommendations are to use Excel-Dna or Add-in Express.

I believe Excel-Dna and Add-in Express both require to run as an Excel add-in. In my case the business logic runs in another process and creates an Excel instance (New Excel.Application()). On changes in Excel it needs to update internal objects.

Getting the cell values from Excel can efficiently be done with DirectCast(Range.Value, Object(,)). Unfortunately, this is not possible for other properties like Range.Text or Range.NumberFormat. As expected, iterating the through cells is very slow.

What do you think is the most efficient way to access the Excel data from the main process? One idea I came up with is to add an add-in to Excel (using say Excel-Dna) which aggregates the data and then posts it to the main application. As I do not have much experience with cross-AppDomain marshalling I would appreciate an assessment if this is a viable approach or another one should be preferred.

Community
  • 1
  • 1
Paul B.
  • 2,394
  • 27
  • 47
  • Its usually not a good idea to get Range.Text values because you can lose precision or get ### instead of a number if the column width is too small. See http://fastexcel.wordpress.com/2011/11/30/text-vs-value-vs-value2-slow-text-and-how-to-avoid-it/ for a comprison of .Text .Value and .Value2 – Charles Williams Jan 11 '12 at 13:16
  • Agreed, it's better to use .Value(2) and .NumberFormat. So the question remains how to efficiently query .NumberFormat (plus additional formatting properties I'd like to evaluate in a second step). – Paul B. Jan 11 '12 at 15:10

2 Answers2

0

Getting the number format from a range of cells one-by-one is going to be slow with any technology (and slower with Interop) and there is no way that I know of to get a corresponding array of different number formats with a single object model call.
If your excel data is in clusters of formats you may be able to exploit the fact that Range("C1:C4").Numberformat will get a single number format if all the cells in the range have the same number format but if any of the number formats within the range differ you will get a Null

Charles Williams
  • 23,121
  • 5
  • 38
  • 38
  • Thanks for your thoughts. As I think there is no way to determine a range that has a common format it won't be possible to use NumberFormat on a range (and get Null if it's not consistent) because I have no information about the formats. – Paul B. Jan 11 '12 at 17:35
  • OK: Curiosity question - why do you need formatted data anyway- its not the real data and loses precision etc compared to the real values? – Charles Williams Jan 12 '12 at 00:39
  • It's for displaying purposes: I would like the user to be able to set a switch like "Formats as in Excel" so he/she does not need to worry about number formats twice. The real data from .Value/Value2 will be stored and used in conjunction with the format to generate the output. – Paul B. Jan 12 '12 at 07:05
0

Maybe you can read the file using one of the direct file readers, instead of going through Excel. (If need be you can automate Excel to save the file, or a copy of part of the file.)

As a start I suggest you look at ClosedXML, but there are many similar projects. ClosedXML look like an active project and tries to mirror the COM interface. [I have no experience with it myself.]

Govert
  • 16,387
  • 4
  • 60
  • 70
  • Hi Govert, Thanks for your input. If you are saying so I am now 100% sure that Excel-Dna was not made for this task :). Using XML data is an interesting workaround, I will look into this approach. The XML data is only accessible if saving as XML file or via the clipboard, right? – Paul B. Jan 11 '12 at 17:39
  • The xml files are the default file format since Excel 2007 - all the .xlsx files are just zip files with all the content in xml. So it is the normal Excel 2007/2010 files that are processed by ClosedXML. I don't think it would work through the Clipboard though. If you do it the other way - have some add-in running inside Excel and make your own cross-process data transfer, Excel-DNA would be the ideal tool for the add-in, since you can read the data quickly with the C API. – Govert Jan 12 '12 at 14:06
  • Actually, I was planning to use the [Excel 2003 XML Spreadsheet format](http://msdn.microsoft.com/en-us/library/aa140066%28v=office.10%29.aspx) since I hope it involves less overhead when writing and reading. The data can also be retrieved from the clipboard (format "XML Spreadsheet") but I know it violates UI principles to highjack the clipboard. – Paul B. Jan 12 '12 at 15:38