0

Summary

I have written an Excel wrapper in .NET using Visual Basic and Visual Studio 2008.

This question is the continuation of my quest! =P Please have an eye out these questions before you read any further.

  1. How to implement column self-naming from its index?;
  2. How to make this custom worksheet initialization faster?;
  3. What architecture to use to address this SystemOutOfMemoryException while allowing me to instantiate the cells of a sheet?.

I have also consulted other related questions on the topic and found these pretty interesting:

A. Excel Interop - Efficiency and performance;
B. Fastest way to write cells to Excel with Office Interop?.

Hypothesis

I now know that I won't be able, for performance sake, to walk my way as I expected it on the first draft.

Now, I have written these hypothesis to help me through:

  • Upon creation of a logical workbook, that is, a workbook that is not yet to be saved on the hard drive, I don't need to initialize my cells right away;

Since I'm don't need to initialize the cells, I could simply initialize my ReadOnlyDictionary(Of String, ICell) with null cells while building my column names and initialize them into the dictionary.

On the first access of a cell, I could simply check whether it is initialized, and initialize it when it's not.

  • When a workbook is created on disk, I can have no cells to initialize, that is, I create the file on disk, and return the instance of the IWorkbook interface instance created.

In fact, this workbook just got created and has no values in it yet. Hence, the partial initialization of the dictionary of cells might once again be the way to go.

Aside, let's say I have to set values to cells, I know that it will get faster if I simply use a two dimensional array which contains the values that I have to set the cells to.

In both first scenarios, the use of an object array outside the ISheet.Cells dictionary is interesting to consider for performance, but seems odd when speaking object-oriented programming, when we know that a cell shall be aware of its own existence. That is what I intended to do while using dependency injection between the Range interop interface and the Cell class.

In such approach, how to arrimate the native cells with a cell value?

  • It is upon the opening of a workbook that I have to initialize the Sheet.Cells with the Ranges that has values.
  • A used cell contains a non-null value;
  • The used cells are returned under two possible ways:
    i. An instance of the Range interop interface; ii. Values of the used Range can be obtained within a two dimensional object array.

Architecture and behaviour thoughts

I thought of writing myself a CellCollection or the like to be initialized with the two dimensional array obtained from the (object[,])NativeSheet.UsedRange.get_Value(XlRangeValueDataType.xlRangeValueDefault) and then have a property where I could make the call to a cell with its address ("A1"), which would then get the respective value from the object[,] and return it. When it would come to format a given, I could simply call the appropriate Range given either one or multiple cells.

This thought is yet to be precised and I'm still a bit confused while explaning it. Should you have any question about it, please feel free to ask so that I might myself get things leaner and swifter.

Questions

  • What is the best way to optimize this architecture while keeping it "clean" in regards to its object-oriented design?

  • Are there other alternatives which will allow me to keep the cell addressing feature available while investing into performance?

Thanks to everyone for your kind and valuable help!

Community
  • 1
  • 1
Will Marcouiller
  • 23,773
  • 22
  • 96
  • 162

1 Answers1

0

Please see answer to this question for the answer:
What architecture to use to address this SystemOutOfMemoryException while allowing me to instantiate the cells of a sheet?

Community
  • 1
  • 1
Will Marcouiller
  • 23,773
  • 22
  • 96
  • 162