35

I have a two-dimensional array and I need to convert it to a List (same object). I don't want to do it with a for or foreach loop that takes each element and adds it to the List. Is there some other way to do it?

Micha Wiedenmann
Yanshof

3 Answers

63

Well, you can make it use a "blit" sort of copy, although it does mean making an extra copy :(

double[] tmp = new double[array.GetLength(0) * array.GetLength(1)];    
Buffer.BlockCopy(array, 0, tmp, 0, tmp.Length * sizeof(double));
List<double> list = new List<double>(tmp);

If you're happy with a single-dimensional array of course, just ignore the last line :)

Buffer.BlockCopy is implemented as a native method, which I'd expect to use extremely efficient copying after validation. The List<T> constructor that accepts an IEnumerable<T> is optimized for the case where the argument implements ICollection<T>, as double[] does: it creates a backing array of the right size and asks the collection to copy itself into that array. Hopefully that will use Buffer.BlockCopy or something similar too.
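As a minimal sketch of the snippet above wrapped into a reusable method: the class name `FlattenExample` and the method name `Flatten` are made up for illustration, and the sketch assumes a rectangular `double[,]`. The point to notice is that `Buffer.BlockCopy` takes its offsets and count in bytes, not elements, and that the elements come out in row-major order (last index varying fastest).

```csharp
using System;
using System.Collections.Generic;

public static class FlattenExample
{
    // Hypothetical helper: flattens a rectangular array into a List<double>.
    public static List<double> Flatten(double[,] array)
    {
        // array.Length is the total element count across both dimensions.
        double[] tmp = new double[array.Length];

        // BlockCopy works in bytes, so scale the element count by sizeof(double).
        Buffer.BlockCopy(array, 0, tmp, 0, tmp.Length * sizeof(double));

        // The constructor copies tmp into the list's own backing array.
        return new List<double>(tmp);
    }
}
```

Calling `FlattenExample.Flatten` on `{ { 1.0, 2.0 }, { 3.0, 4.0 } }` should give a list containing 1, 2, 3, 4 in that order.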

Here's a quick benchmark of the three approaches (for loop, Cast<double>().ToList(), and Buffer.BlockCopy):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        double[,] source = new double[1000, 1000];
        int iterations = 1000;

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            UsingCast(source);
        }
        sw.Stop();
        Console.WriteLine("LINQ: {0}", sw.ElapsedMilliseconds);

        GC.Collect();
        GC.WaitForPendingFinalizers();

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            UsingForLoop(source);
        }
        sw.Stop();
        Console.WriteLine("For loop: {0}", sw.ElapsedMilliseconds);

        GC.Collect();
        GC.WaitForPendingFinalizers();

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            UsingBlockCopy(source);
        }
        sw.Stop();
        Console.WriteLine("Block copy: {0}", sw.ElapsedMilliseconds);
    }


    static List<double> UsingCast(double[,] array)
    {
        return array.Cast<double>().ToList();
    }

    static List<double> UsingForLoop(double[,] array)
    {
        int width = array.GetLength(0);
        int height = array.GetLength(1);
        List<double> ret = new List<double>(width * height);
        for (int i = 0; i < width; i++)
        {
            for (int j = 0; j < height; j++)
            {
                ret.Add(array[i, j]);
            }
        }
        return ret;
    }

    static List<double> UsingBlockCopy(double[,] array)
    {
        double[] tmp = new double[array.GetLength(0) * array.GetLength(1)];    
        Buffer.BlockCopy(array, 0, tmp, 0, tmp.Length * sizeof(double));
        List<double> list = new List<double>(tmp);
        return list;
    }
}

Results (times in milliseconds):

LINQ: 253463
For loop: 9563
Block copy: 8697

EDIT: Having changed the for loop to call array.GetLength() on each iteration, the for loop and the block copy take around the same time.
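The edited loop isn't shown in the benchmark listing, so the following is only a guess at what that variant looks like: the same nested loop, but with `GetLength` called in each loop condition instead of being cached in locals. The class and method names are invented for this sketch. The comments on another answer suggest the JIT can then eliminate bounds checks, which would explain the speed-up, but that is an observation from the thread rather than something this sketch verifies.

```csharp
using System.Collections.Generic;

public static class GetLengthLoopExample
{
    // Hypothetical variant of UsingForLoop: GetLength is evaluated in the loop conditions.
    public static List<double> UsingForLoopGetLength(double[,] array)
    {
        // Pre-size the list so it never has to grow.
        List<double> ret = new List<double>(array.Length);
        for (int i = 0; i < array.GetLength(0); i++)
        {
            for (int j = 0; j < array.GetLength(1); j++)
            {
                ret.Add(array[i, j]);
            }
        }
        return ret;
    }
}
```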

Jon Skeet
  • The main problem with that one is that it can leave a big temporary array on the large object heap. – CodesInChaos Feb 27 '11 at 09:51
  • @CodeInChaos: Absolutely. It's a pain we can't tell `List` to just use the given array :( I think it's still likely to be faster than looping though. – Jon Skeet Feb 27 '11 at 09:57
  • The problem with telling `List` to use a certain array is that we could tell several lists to use the same array. Not sure how big a problem that'd be in practice. – CodesInChaos Feb 27 '11 at 10:02
  • @CodeInChaos: Yup, that's why there's no way of doing it. It's probably the right decision on the part of the BCL team - it's just irritating for things like this :) – Jon Skeet Feb 27 '11 at 10:10
  • One interesting observation on the looping solution is that it's twice as slow if one swaps the inner and outer loop. Most likely due to CPU caches working better if you read/write sequentially. – CodesInChaos Feb 27 '11 at 11:05
  • @CodeInChaos: Yes, that's a fairly well-known phenomenon. I don't think the reason is so much related to CPU caches as it is the physical location of the data in memory. You have a 2D array indexed in row-major order, it's much faster to iterate through the array sequentially than it is to jump around. Read more about it [here](http://stackoverflow.com/questions/405810/net-rectangular-arrays-how-to-access-in-a-loop) and [here](http://stackoverflow.com/questions/997212/fastest-way-to-loop-through-a-2d-array). – Cody Gray - on strike Feb 27 '11 at 11:48
  • Iterating sequentially is faster because when transferring from the RAM to the cache several sequential entries are fetched at the same time. I'd guess that if the size of your array entries is a multiple of the cache-line size the advantage of serial access disappears. – CodesInChaos Feb 27 '11 at 12:12
  • @JonSkeet: Is there a way in which you can alter the block copy so that it only copies a specific range from the source array? For instance if I wanted all of the data between indexes (10,20) through (20, 20)? – LamdaComplex Dec 09 '11 at 19:31
  • @LamdaComplex: I don't believe so - because that isn't copying a *block* of data - that's copying 11 separate values. – Jon Skeet Dec 09 '11 at 19:32
  • @JonSkeet: I was really thinking this would help me. I'm actually tackling a slightly different problem involving 3D arrays and grabbing a volume of data from it and flattening it into a 1D array as fast as possible. http://stackoverflow.com/questions/8448635/flatten-a-volume-of-a-3d-array-into-a-1d-array-of-objects/8448952#8448952 – LamdaComplex Dec 09 '11 at 19:35
42

To convert double[,] to List<double>, if you are looking for a one-liner, here goes

double[,] d = new double[,]
{
    {1.0, 2.0},
    {11.0, 22.0},
    {111.0, 222.0},
    {1111.0, 2222.0},
    {11111.0, 22222.0}
};
List<double> lst = d.Cast<double>().ToList();


But if you are looking for something efficient, I'd advise against using this code.
Please follow either of the other two answers instead. Both use much better techniques.
AbdelAziz AbdelLatef
naveen
  • Aside from everything else, that will end up boxing every `double` in the array... it'll perform poorly. – Jon Skeet Feb 27 '11 at 09:48
  • I think OP accepts this answer because he wants a "clearer" (I mean easy for him to implement and understand) way, not the real fastest way. – Cheng Chen Feb 27 '11 at 09:52
  • @Danny: I'm not really sure how this method is any clearer or easier to understand than a `for` loop, which the OP explicitly wishes to avoid. Not to mention the title says "Fast". – Cody Gray - on strike Feb 27 '11 at 09:53
  • In my quick benchmark of a 1000 x 1000 array, this performs over *30 times* as slowly as the for loop or the Buffer.BlockCopy solution. I'm pretty surprised it's been accepted, given the "Fast" part of the title. – Jon Skeet Feb 27 '11 at 09:59
  • (A longer test changed the timings slightly, but it's still easily an order of magnitude slower.) – Jon Skeet Feb 27 '11 at 10:09
  • @downvoters: thanks for letting me know that I know less than JonSkeet. :) Please understand that I am not deleting the answer, because OP used this code somewhere and is happy with it. Tell me, how many of you downvoters work at enterprise level? funny – naveen Sep 05 '13 at 15:24
  • @naveen `ToList` is certainly the approach I'd use most of the time since it's usually fast enough. I assume the downvoters prefer other solutions because the question explicitly asks for a *fast way* and your solution is relatively slow. – CodesInChaos Sep 05 '13 at 17:23
  • @CodesInChaos: fast way in India means, finish the code fast. Coding is a highly profitable thing here compared to other jobs. I still believe the guy wanted a solution he could implement fast, not execute fast. – naveen Sep 05 '13 at 18:32
  • @naveen: I see no real evidence of that, and given that the OP can use any of the answers by just copying and pasting them and adjusting to his variable names, they're all equally "fast" by that definition. Even if you think that's the most likely intention, your answer provides no indication of the inefficiency involved which would be appropriate in order to serve *all* readers rather than just the original poster. – Jon Skeet Sep 05 '13 at 19:09
  • @JonSkeet: i realize that. the downvote irks me a bit. thats all :) – naveen Sep 06 '13 at 05:51
  • Well maybe you should improve your answer then? Even just explaining that it *is* slow (and why) would be better. Your answer does not help to answer the question which apparently many people believe was asked (myself included), so those downvotes are reasonable. Look at it this way: you're still massively rep-positive for an answer which doesn't provide an efficient solution. – Jon Skeet Sep 06 '13 at 05:54
11

A for loop is the fastest way.

You may be able to do it with LINQ, but that will be slower. And while you don't write a loop yourself, under the hood there is still a loop.

  • For a jagged array you can probably do something like arr.SelectMany(x=>x).ToList().
  • On T[,] you can't simply call arr.ToList(): the 2D array implements only the non-generic IEnumerable, not IEnumerable<T>, so you need to insert a Cast<double>() first, like yetanothercoder suggested. That will make it even slower due to boxing.

The only thing that can make the code faster than the naive loop is calculating the number of elements and constructing the List with the correct capacity, so it doesn't need to grow.
If your array is rectangular, you can obtain the size as width*height; with jagged arrays it can be harder.

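int width = 1000;
int height = 3000;
double[,] arr = new double[width, height];
// Pre-size the list so it never has to grow.
List<double> list = new List<double>(width * height);
// Read both dimensions once up front.
int size0 = arr.GetLength(0);
int size1 = arr.GetLength(1);
for (int i = 0; i < size0; i++)
{
    // The last index varies in the inner loop, so memory is read sequentially.
    for (int j = 0; j < size1; j++)
        list.Add(arr[i, j]);
}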
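For the jagged case mentioned above, here is a rough sketch of one way to get the capacity right: sum the row lengths in a first pass, then copy each row with `AddRange`. The names `JaggedFlattenExample` and `Flatten` are invented for this illustration, and the sketch assumes that no row is null.

```csharp
using System.Collections.Generic;

public static class JaggedFlattenExample
{
    // Hypothetical helper: flattens a jagged array, assuming every row is non-null.
    public static List<double> Flatten(double[][] jagged)
    {
        // First pass: count the elements so the list can be allocated once.
        int total = 0;
        foreach (double[] row in jagged)
        {
            total += row.Length;
        }

        // Second pass: append each row in order.
        List<double> list = new List<double>(total);
        foreach (double[] row in jagged)
        {
            list.AddRange(row);
        }
        return list;
    }
}
```

The extra pass only touches the outer array, so it should be cheap compared with the copying itself.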

In theory it might be possible to use private reflection and unsafe code to make it a bit faster by doing a raw memory copy. But I strongly advise against that.

CodesInChaos
  • Could you give a sample of what you're thinking about in the `for` loop so I can benchmark it against my `Buffer.BlockCopy` approach? I'd *expect* mine to be faster, but I want to make sure I'm testing the right thing... – Jon Skeet Feb 27 '11 at 09:47
  • I think it should be arr.Cast<double>().ToList()? – Cheng Chen Feb 27 '11 at 09:49
  • Looks like yours is about twice as fast @Jon – CodesInChaos Feb 27 '11 at 09:57
  • @Danny That's why I edited it while you were writing your comment. – CodesInChaos Feb 27 '11 at 09:58
  • @CodeInChaos: You can optimize yours somewhat by not calling `GetLength` on every iteration... but in my tests, Buffer.BlockCopy is still a bit faster. – Jon Skeet Feb 27 '11 at 10:00
  • Interesting what the jitter does understand, and what it doesn't. In one way it seems to know that the dimensions of an array won't change (since it eliminates the bounds checks; if I use height and width it's much slower) but still doesn't understand that it can remove later calls to get the size. – CodesInChaos Feb 27 '11 at 10:08
  • @CodeInChaos: Interesting - I knew the JIT optimized around using the vector .Length property, but I didn't realise it would optimize around .GetLength() as well. I've edited my answer to reflect that too - our solutions end up being around the same then. – Jon Skeet Feb 27 '11 at 12:00