0

Around 15 years ago, I was writing .xll addins for Excel using the C-API, mostly for spreadsheet formulae (complex and time-consuming calcs), and written in C++ (very much my preferred language). I subsequently did a few things with C# wrappers.

After a long break, during which Excel and Visual Studio have moved on apace, I am once again looking at building an addin to use compiled code for a lengthy optimization, rather than using a VBA script. The optimization involves taking large arrays of data out of worksheets, crunching then and then writing the results back into the workbook (as I want to chart them etc later).

NB. This is for my own use, I am not planning on distributing any solution

In pseudo Excel VBA code what I'd like to have is something like this:

Dim obj as MyOptimizer
Dim rngInputData as Range

obj.SetData(rngInputData)
obj.Optimize

Dim v as vOutputData
v = obj.GetResult

I am a little bewildered by the options available (in no particular order):

  • code a C-style dll with a VBA xla wrapper
  • code a C-style dll with an Excel.DNA C# wrapper
  • code a COM object of some sort (though I am not sure with Visual Studio template to use)
  • Something else ...

A COM object has its attractions as it will hold state, and I can create an instance of my object without having to write a clunky handle-passing system. The other consideration is the amount of marshalling between different layers (I'll be using a lot of 2-D arrays, though only of doubles), and the type-checking of the inputs.

I'm looking for some guidance at this crossroads, before I go down the rabbit hole of one particular method ... then of course, I'll be back with more questions!

Thanks!

DS_London
  • 3,644
  • 1
  • 7
  • 24

1 Answers1

1

If C++ is still very much your preferred language, you should certainly consider that. It will give the the best performance possible, and leverage your knowledge of the C API. If you can afford it, I would also recommend you have a look at the XLL+ library. While expensive, I think if you look at the features you'll find it brings a lot of value.

I develop Excel-DNA and very much prefer to avoid C++ in favour of the .NET platform. So, while biased, I think it will be an even better fit for your requirements. It is a vastly easier environment to work in compared to C++ (in my opinion - no XLOPERs!) However, there is some overhead in the marshaling and so you'd have some compromise on the performance. You don't indicate what range sizes or how fast you expect the interchange to happen. With Excel-DNA, I expect you can easily read, process and write a block of a million numbers in about a second (see my answer here: https://stackoverflow.com/a/3868370/44264). Similarly, a small block of 10 cells with numbers can be read and written more than 10,000 times in a second. Things will get a bit slower if you are working with long strings rather than numbers, but you suggest this is not the case.

While you should expect calculation code (apart from the Excel interop) to be very fast with C# and .NET, you can also talk to C libraries from .NET very efficiently. So having core calculation libraries in C and the use the P/Invoke mechanism to interact to them from .NET is a completely viable plan, and you still get the (large) benefit of a more pleasant environment for the Excel parts.

VBA and COM approaches to reading / writing with Excel will not perform quite as well as the Excel-DNA / .NET approach (which also uses the C API for this kind of thing under the hood, rather than COM). Still, when used properly the COM overhead is not terrible, and might not on its own be a showstopper for you.

I am interested in this kind of optimization approach myself, so would be happy to help if you do take the Excel-DNA approach. The best place for Excel-DNA questions is the Excel-DNA Google group.

Getting started with Excel-DNA would look like this:

  • Download and install the (free) Visual Studio 2019 Community Edition.
  • Select the Desktop .NET Development workload when installing. You don't need to check the Office options when installing for Excel-DNA add-ins.
  • Then make a new C# "Class Library (.NET Framework)" project. It's important at this step not to pick ".NET Standard" or ".NET Core" (long story...).
  • Then in your project install the "ExcelDna.AddIn" package from NuGet.
  • Read and follow the instructions in the readme that pops up.
  • Paste in the code snippet from here: https://stackoverflow.com/a/3868370/44264 press F5 and test in Excel.

After the slow Visual Studio install, it will only take a few minutes, and you'll get some idea of what's involved.

Govert
  • 16,387
  • 4
  • 60
  • 70
  • Many thanks Govert for a full & complete answer, and the advice. To be more specific my optimisation has inputs of several arrays, each of which will have around a million items. H – DS_London Jul 19 '20 at 12:26
  • (Sorry comment got cut off). The ability to pass objects across the COM interface appeals to me, with the built in reference counting. Using CComSafeArray to unpack the passed arrays is quite painless. If I were looking to write spreadsheet functions, I'd go the Excel.DNA route (as the marshalling speed is more important). – DS_London Jul 20 '20 at 15:22