3

As my goal is to out perform the List<T> i am testing arrays and found few starting points to get on testing i have tested this before trying to capture bitmaps off screen, and tests proved the usage is suffice. my question is what data types could use this Copy() code except for byte[]

say i want a data storage unit to take the advantage of unmanaged / unsafe

        public unsafe struct NusT
        {
            public unsafe int vi;
            public unsafe bool vb;
        }

instead of populating a list

i initialise the struct as follows : 1)

NusT n; 
n.vi= 90;
n.vb=true

i have tested this after testing the folowing: 2)

NusT n = new NusT(){vi=90, vb=true};

this test was after testing :3)

NusT n = new NusT("90", true);

i think both last had same results but the first one is blazing fast, as i do not create an object so

NusT n-> instructions- 1
n.vi=90 -> instructions- 1
n.vb=true -> instructions- 1 

now i minimized what i could and this started at the begining with a class: whitch was even worse than 2 & 3 above as it also uses properties

class bigAndSlow
{
     public int a { get; private set;}
     public bool b { get; private set;}
     public string  c { get; private set;}

     public bigAndSlow(int .. ,boo .. , string.. )
     {
        initialise ...
     }
}

so now when the final decision is

        public unsafe struct NusT
        {
            public unsafe int vi;
            public unsafe bool vb;
        }

how can i implement this blazingly fast data unit to use Copy() on

NusT[] NustyArr;


    static unsafe void Copy(byte[] src, int srcIndex,
            byte[] dst, int dstIndex, int count)
        {
            if (src == null || srcIndex < 0 ||
                dst == null || dstIndex < 0 || count < 0)
            {
                throw new ArgumentException();
            }
            int srcLen = src.Length;
            int dstLen = dst.Length;
            if (srcLen - srcIndex < count ||
                dstLen - dstIndex < count)
            {
                throw new ArgumentException();
            }


            // The following fixed statement pins the location of
            // the src and dst objects in memory so that they will
            // not be moved by garbage collection.          
            fixed (byte* pSrc = src, pDst = dst)
            {
                byte* ps = pSrc;
                byte* pd = pDst;

                // Loop over the count in blocks of 4 bytes, copying an
                // integer (4 bytes) at a time:
                for (int n = 0; n < count / 4; n++)
                {
                    *((int*)pd) = *((int*)ps);
                    pd += 4;
                    ps += 4;
                }

                // Complete the copy by moving any bytes that weren't
                // moved in blocks of 4:
                for (int n = 0; n < count % 4; n++)
                {
                    *pd = *ps;
                    pd++;
                    ps++;
                }
            }
        }


        static void Main(string[] args)
        {
            byte[] a = new byte[100];
            byte[] b = new byte[100];
            for (int i = 0; i < 100; ++i)
                a[i] = (byte)i;
            Copy(a, 0, b, 0, 100);
            Console.WriteLine("The first 10 elements are:");
            for (int i = 0; i < 10; ++i)
                Console.Write(b[i] + " ");
            Console.WriteLine("\n");
        }
sujith karivelil
  • 28,671
  • 6
  • 55
  • 88
LoneXcoder
  • 2,121
  • 6
  • 38
  • 76

1 Answers1

2

Yes, you can do this with any blittable type. The blittable types are primitive types (integer and float types, but not bool), one-dimensional arrays of blittable types and structures containing fields of blittable types only.

The structure NusT is not blittable because it contains bool field. Just change it to byte and you will get a blittable structure for which you can obtain a pointer.

Here is the code that works for any type:

static unsafe void UnsafeCopy<T>(T[] src, int srcIndex, T[] dst, int dstIndex, int count) where T : struct
{
    if (src == null || srcIndex < 0 || dst == null || dstIndex < 0 || count < 0 || srcIndex + count > src.Length || dstIndex + count > dst.Length)
    {
        throw new ArgumentException();
    }

    int elem_size = Marshal.SizeOf(typeof(T));

    GCHandle gch1 = GCHandle.Alloc(src, GCHandleType.Pinned);
    GCHandle gch2 = GCHandle.Alloc(dst, GCHandleType.Pinned);

    byte* ps = (byte*)gch1.AddrOfPinnedObject().ToPointer() + srcIndex * elem_size;
    byte* pd = (byte*)gch2.AddrOfPinnedObject().ToPointer() + dstIndex * elem_size;
    int len = count * elem_size;

    try
    {
        // Loop over the count in blocks of 4 bytes, copying an
        // integer (4 bytes) at a time:
        for (int n = 0; n < len / 4; n++)
        {
            *((int*)pd) = *((int*)ps);
            pd += 4;
            ps += 4;
        }

        // Complete the copy by moving any bytes that weren't
        // moved in blocks of 4:
        for (int n = 0; n < len % 4; n++)
        {
            *pd = *ps;
            pd++;
            ps++;
        }
    }
    finally
    {
        gch1.Free();
        gch2.Free();
    }
}

But I strongly advice you to use Array.Copy. It is already the most efficient way to copy arrays. See the benchmarks of copying array of 1M elements below:

byte[] Array.Copy: 57,491 us

byte[] FastCopy: 138,198 us

byte[] JustCopy: 792,399 us

byte[] UnsafeCopy: 138,575 us

byte[] MemCpy: 57,667 us

NusT[] Array.Copy: 1,197 ms

NusT[] JustCopy: 1,843 ms

NusT[] UnsafeCopy: 1,550 ms

NusT[] MemCpy: 1,208 ms

FastCopy is your copy function, UnsafeCopy is my templated function, JustCopy is a simple implementation for (int i = 0; i < src.Length; i++) dst[i] = src[i];. MemCpy is PInvoke call of msvcrt memcpy function.

The verdict is: using pointers in C# for performance improvement is a bad practice. JIT does not optimize the unsafe code. The best solution is to move performance critical code to native DLLs.

Community
  • 1
  • 1
Andrey Nasonov
  • 2,619
  • 12
  • 26
  • "It is already the most efficient way to copy arrays" That's not quite certain I think. Depends on the array size at the very least. Maybe your copy was not optimized well or ran under Debug mode. Another possibility is to call a native `memcpy`. – usr Oct 04 '15 at 20:09
  • It was release mode w/o debugger. Added results for memcpy. – Andrey Nasonov Oct 04 '15 at 20:18
  • OK, Array.Copy seems to be memcpy plus startup overhead. For small arrays the method with the lowest startup costs will win. For big arrays you're right. Anyway, there's no way to know if this is even what the OP is concerned about. He needs to react. – usr Oct 04 '15 at 20:23
  • @usr My point as described at the start of this topic is, to try and beat the performance of list , as u can see from the very kind @ Andrey's answer, he had tested and came to result as same with my collection argument with my self ,same goes to manipulating it, eg substract add, query etc'. i was thinking there must have been some steps taken by managed and built-in methods /classes ... that could possibly be shaved.. – LoneXcoder Oct 04 '15 at 20:30
  • For small arrays (about 10 elements) the advantage of `Array.Copy` is even more evident. Unsafe methods (memcpy, c# pointers) need to fix the object in memory while `Array.Copy` does not need this. – Andrey Nasonov Oct 04 '15 at 20:30
  • @LoneXcoder, you can improve the performance of `List` by optimizing the code of `List` for your needs. For example, the method `AddRange` http://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs,79de3e39e69a4811 can be optimized by eliminating the creation of temporary array `itemsToInsert` when unserting an array or another list. – Andrey Nasonov Oct 04 '15 at 20:36
  • Fixing arrays is free if the type is statically known. `fixed` is next to free. This is only slow because of the GC handle. Array.Copy is not always superior, for example in the 0-4 element range I bet that a safe managed copy loop is the fastest. – usr Oct 04 '15 at 20:41
  • @AndreyNasonov, first thanks alot for your broad answer. as well as the tests. i have reviewed the code and thinking , what if i did stand to the BLT family , can i pass to the NusT struct converted to BLT-ready data and add fixed to the values or some other similar logic/approach – LoneXcoder Oct 04 '15 at 20:44
  • @usr my english is not the best. at the time you mentioned fixed i was thinking the exact same . it's just took me the time to write it after reviewing the code see if we can eliminate something – LoneXcoder Oct 04 '15 at 20:48
  • Yes, the managed loop was the fastest for very small arrays (<20 elements). So `Array.Copy` has a large constant overhead but the optimal memory copy algorithm. – Andrey Nasonov Oct 04 '15 at 20:51
  • @AndreyNasonov hey. andrey as i am still researching and the point is not only copying (stage where it needs to 'grow') there's also other manipulations such as pop, another simple task...though string manipulation in a collection which take place in unmanaged state surly would gain some extra hp.. computation such as this >http://stackoverflow.com/questions/2748749/fast-algorithm-implementation-to-sort-very-small-list – LoneXcoder Oct 07 '15 at 00:59