
So it turns out not all arrays are created equal. Multi-dimensional arrays can have non-zero lower bounds. See, for example, the Excel PIA's Range.Value property: object[,] rectData = myRange.Value;

I need to convert this data into a jagged array. My first try, below, smells of complexity. Any suggestions to optimize it? It needs to handle the general case where the lower bounds may not be zero.

I have this extension method:

    public static T[][] AsJagged<T>( this T[,] rect )
    {
        int row1 = rect.GetLowerBound(0);
        int rowN = rect.GetUpperBound(0);
        int col1 = rect.GetLowerBound(1);
        int colN = rect.GetUpperBound(1);

        int height = rowN - row1 + 1;
        int width = colN - col1 + 1;
        T[][] jagged = new T[height][];

        int k = 0;
        int l;
        for ( int i = row1; i < row1 + height; i++ )
        {
            l = 0;
            T[] temp = new T[width];
            for ( int j = col1; j < col1 + width; j++ )
                temp[l++] = rect[i, j];
            jagged[k++] = temp;
        }

        return jagged;
    }

Used like this:

    public void Foo()
    {
        int[,] iRect1 = { { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 }, { 1, 1, 1, 1 } };
        int[][] iJagged1 = iRect1.AsJagged();

        int[] lengths = { 3, 5 };
        int[] lowerBounds = { 7, 8 };
        int[,] iRect2 = (int[,])Array.CreateInstance(typeof(int), lengths, lowerBounds);
        int[][] iJagged2 = iRect2.AsJagged();

    }

Curious if Buffer.BlockCopy() would work or be faster?

Edit: AsJagged needs to handle reference types.

Edit: Found bug in AsJagged(). Added int l; and added col1 + width to inner loop.

dFlat

2 Answers


A few caveats/assumptions up front:

  • You seem to use only int as your data type (or at least seem to be OK with using Buffer.BlockCopy, which would imply you can live with primitive types in general).
  • For the test data you show, I don't think there will be much difference between any somewhat sane approaches.

Having that said, the following implementation (which needs to be specialized for a specific primitive type (here int) because it uses fixed) is around 10 times faster than the approach using the inner loop:

    unsafe public static int[][] AsJagged2(int[,] rect)
    {
        int row1 = rect.GetLowerBound(0);
        int rowN = rect.GetUpperBound(0);
        int col1 = rect.GetLowerBound(1);
        int colN = rect.GetUpperBound(1);

        int height = rowN - row1 + 1;
        int width = colN - col1 + 1;
        int[][] jagged = new int[height][];

        int k = 0;
        for (int i = row1; i < row1 + height; i++)
        {
            int[] temp = new int[width];

            // copy one full row: width elements of sizeof(int) bytes each
            fixed (int* dest = temp, src = &rect[i, col1])
            {
                MoveMemory(dest, src, width * sizeof(int));
            }

            jagged[k++] = temp;
        }

        return jagged;
    }

    [DllImport("kernel32.dll", EntryPoint = "RtlMoveMemory")]
    unsafe internal static extern void MoveMemory(void* dest, void* src, int length);

Using the following "test code":

    static void Main(string[] args)
    {
        Random rand = new Random();
        int[,] data = new int[100,1000];
        for (int i = 0; i < data.GetLength(0); i++)
        {
            for (int j = 0; j < data.GetLength(1); j++)
            {
                data[i, j] = rand.Next(0, 1000);
            }
        }

        Stopwatch sw = Stopwatch.StartNew();

        for (int i = 0; i < 100; i++)
        {
            int[][] dataJagged = AsJagged(data);
        }

        Console.WriteLine("AsJagged:  " + sw.Elapsed);

        sw = Stopwatch.StartNew();

        for (int i = 0; i < 100; i++)
        {
            int[][] dataJagged2 = AsJagged2(data);
        }

        Console.WriteLine("AsJagged2: " + sw.Elapsed);
    }

Where AsJagged (the first case) is your original function, I get the following output:

AsJagged:  00:00:00.9504376
AsJagged2: 00:00:00.0860492

So there is indeed a faster way of doing it. However, depending on the size of the test data, the number of times you actually perform this operation, and your willingness to allow unsafe and P/Invoke code, you probably won't need it.

Having that said, we were using large matrices of double (say 7000x10000 elements) where it indeed did make a huge difference.

Update: about using Buffer.BlockCopy

I might be overlooking some Marshal or other trick, but I don't think using Buffer.BlockCopy is possible here, because it requires both the source and the destination array to, well, be an Array.

In our example, the destination is an array (e.g. int[] temp = ...), but the source is not. While we "know" that for two-dimensional arrays of primitive types the layout is such that each "row" (i.e. the first dimension) is an array of the type in memory, there is no safe (as in not unsafe) way to get at that array without the overhead of copying it first. So we basically need a function that simply deals with memory and doesn't care about its actual content - like MoveMemory. BTW, the internal implementation of Buffer.BlockCopy does something similar.

Christian.K
  • Thanks for the answer; the use case in my question is limited. AsJagged() does need to handle reference types... How would that change your solution? (PS: I will update the question.) – dFlat Feb 20 '12 at 07:30
  • I think you'd be better off not resorting to unsafe code, unless it's absolutely crucial. – zmbq Feb 20 '12 at 07:32
  • @dFlat: my solution already only works for non-reference, or more precisely primitive, types. If you need to support multiple different types, you would require overloads of the `AsJagged2` function for each type. However, note that you should really measure given your anticipated requirements (i.e. size of arrays) before dumping your approach. – Christian.K Feb 20 '12 at 07:38
  • @zmbq Well, that is quite general advice, and thus never really wrong ;-) But seriously, I hope I made it somewhat clear in my answer that one really needs to measure (in his problem domain) before resorting to such measures (I "highlighted" the statement just in case). – Christian.K Feb 20 '12 at 07:40

Your complexity is O(N*M), where N is the number of rows and M the number of columns. That's the best you can get when copying N*M values...

Buffer.BlockCopy might be faster than your inner for loop, but I wouldn't be surprised if the compiler knows how to handle this code properly and you won't gain any further speed. You should test it to make sure.
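If you do test it, a sketch of such a variant might look like the following (my own sketch, not part of the answer; it relies on the fact that Buffer.BlockCopy addresses any primitive-typed Array, including a multi-dimensional one, by flat byte offsets, so it only works for primitive element types such as int):

```csharp
using System;

static class JaggedCopy
{
    // Row-by-row copy using Buffer.BlockCopy with flat byte offsets.
    // Lower bounds don't matter here: byte offset 0 is always the first
    // element of the rectangular array, and row i starts at i * rowBytes.
    public static int[][] AsJaggedBlockCopy(int[,] rect)
    {
        int height = rect.GetUpperBound(0) - rect.GetLowerBound(0) + 1;
        int width = rect.GetUpperBound(1) - rect.GetLowerBound(1) + 1;
        int rowBytes = width * sizeof(int);

        var jagged = new int[height][];
        for (int i = 0; i < height; i++)
        {
            var temp = new int[width];
            Buffer.BlockCopy(rect, i * rowBytes, temp, 0, rowBytes);
            jagged[i] = temp;
        }
        return jagged;
    }
}
```

Like the unsafe MoveMemory version, this only helps for primitive element types; for reference types you are back to an element-by-element loop.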

You may be able to achieve better performance by not copying the data at all (at the potential expense of slightly slower lookups). If you create an 'array row' class, that holds your rect and a row number, and provides an indexer that accesses the correct column, you can create an array of such rows, and save yourself the copying altogether.

The complexity of creating such an array of 'array rows' is O(N).

EDIT: An ArrayRow class, just because it bugs me...

The ArrayRow could look something like this:

    class ArrayRow<T>
    {
        private readonly T[,] _source;
        private readonly int _row;

        public ArrayRow(T[,] rect, int row)
        {
            _source = rect;
            _row = row;
        }

        public T this[int col] { get { return _source[_row, col]; } }
    }

Now you create an array of ArrayRows, you don't copy anything at all, and the optimizer has a good chance of optimizing accessing an entire row in sequence.
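For completeness, a hypothetical AsRows helper (the name and the extension method are my own sketch, not from the answer) that builds such an array of rows over the ArrayRow<T> class above, handling a non-zero lower bound on the row dimension:

```csharp
public static class ArrayRowExtensions
{
    // O(N): one lightweight wrapper per row, no element copying at all.
    // Assumes the ArrayRow<T> class shown above is in scope.
    public static ArrayRow<T>[] AsRows<T>(this T[,] rect)
    {
        int row1 = rect.GetLowerBound(0);
        int height = rect.GetUpperBound(0) - row1 + 1;
        var rows = new ArrayRow<T>[height];
        for (int i = 0; i < height; i++)
            rows[i] = new ArrayRow<T>(rect, row1 + i);
        return rows;
    }
}
```

Note that the indexer still takes the source array's original column index (including any non-zero lower bound), since ArrayRow passes col straight through to the underlying array.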

zmbq
  • +1 for the mathematical thing. This is a non-trivial operation regardless of how you turn it, because you have to copy all the data by definition. The best you can do is have it done with block methods, not a manual item-by-item copy, but it IS a slow operation. Nothing in the world will change that. – TomTom Feb 20 '12 at 07:42
  • +1 also for the O-notation analysis. Note however that the C# compiler actually optimizes quite a bit less than is commonly assumed. Most optimizations are actually done when jitting the IL to machine code. So unless you are good with your assembler (x86, x64, whatever) it is not too easy to actually prove what is done and what is not. – Christian.K Feb 20 '12 at 07:45
  • Guys, take a look at my last suggestion, it avoids copying the data altogether. – zmbq Feb 20 '12 at 08:03
  • @Christian, when I wrote 'compiler' I meant both the C# compiler and the jitter. It's the aggregate optimization that's interesting in this case. I believe the compiler+jitter combo will see a loop over all row elements as a traversal of one array and check boundaries, for instance, only at the ends and not on every access. – zmbq Feb 20 '12 at 08:11
  • @zmbq Not copying in the first place is certainly a good idea if you don't need to modify the "copy" distinct from the original (which the OP doesn't say a word about, so it is valid to assume that it is possible). – Christian.K Feb 20 '12 at 08:19
  • Well, you could implement a sort of copy-on-write: an ArrayRow gets detached from the original array only when you try to update it. That way you defer the copying until you actually need it. This, of course, assumes the original array doesn't get modified along the way... – zmbq Feb 20 '12 at 08:22
  • @zmbq For a list of possible things the compiler does optimize see [here](http://blogs.msdn.com/b/ericlippert/archive/2009/06/11/what-does-the-optimize-switch-do.aspx). BTW, I'm not trying to counter your argument that optimization _might_ happen. I'm just saying that there are already too many (unproven) myths around about what would potentially be optimized (compiler or jitter). So without some proof or analysis attached, I'm always suspicious. No offense. ;-) – Christian.K Feb 20 '12 at 08:22
  • It's OK, suspicion is good for a programmer... I agree it needs to be tested. – zmbq Feb 20 '12 at 08:23