How is Array.Copy implemented in C#?

Question

I tried to look at the implementation of Array.Copy in C# with ILSpy but it didn't show me the implementation itself.

I wrote a simple benchmark, Array.Copy vs a simple for loop to copy the data. Array.Copy was faster.

How is it implemented faster?

Thanks, Shay

Probably because, since arrays are contiguous in memory, the CLR can calculate the size of memory that needs to be copied and then copy it all at once, rather than one object at a time. — Platinum Azure, Jul 02 '11 at 16:38
Is that possible to implement that in C# or only in assembly or c/c++? — Shay Ben Moshe, Jul 02 '11 at 16:40
@shayfalador: It depends on the JIT compiler. For example, [Mono lets you access SIMD instructions from C#](http://www.google.com/search?q=mono+simd). Microsoft's .NET doesn't give you that level of control, but might use SIMD instructions as a result of optimization. — Ben Voigt, Jul 02 '11 at 16:53

score 33 · Answer 1 · answered Jul 02 '11 at 17:28

Disassembling the Array class will land you on this declaration:

[MethodImpl(MethodImplOptions.InternalCall), SecurityCritical, ReliabilityContract(Consistency.MayCorruptInstance, Cer.MayFail)]
internal static extern void Copy(Array sourceArray, int sourceIndex, Array destinationArray, int destinationIndex, int length, bool reliable);

The [MethodImpl] attribute tells the JIT compiler that the method is actually implemented in the CLR, written in C++ instead of a managed language. It looks in a table of method names and retrieves a pointer to the C++ function that implements the method and compiles it to a simple CALL instruction.

Getting the source code for the CLR is a bit tricky, but the SSCLI20 version is pretty accurate for methods that have been around for a long time and didn't require tweaking. Array.Copy() certainly qualifies. The table I mentioned is defined in clr\src\vm\ecall.cpp, the section that's relevant to your question looks like this:

FCFuncStart(gArrayFuncs)
    FCFuncElement("Copy", SystemNative::ArrayCopy)
    FCFuncElement("Clear", SystemNative::ArrayClear)
    FCFuncElement("get_Rank", Array_Rank)
    //  etc...

The SystemNative::ArrayCopy() function pointer takes you to clr\src\vm\comsystem.cpp. The actual function is too big to copy here without making your eyes glaze over, there is a lot of error checking going on. It looks for a way to optimize the copy, the happy case is where the elements of the array can simply be copied without being converted. That is done by a function named m_memmove(). You'll find that function in the same file, it is used in the 32-bit version of the CLR.

Which first copies a single byte at a time until the destination address is aligned on a multiple of 4 bytes. Then it copies 16 bytes at a time, 4 times 4, these copies are fast because they are aligned. Then it copies what's left one byte at a time.

You can perhaps now see why it can be faster than your own loop. It can move 4 bytes at a time even if the array element size is not 4 bytes wide. And it can do so while ensuring the copy address is aligned, you can't since the physical address of the array element isn't discoverable.

score 11 · Accepted Answer · edited May 23 '17 at 11:46

11

Same techniques used to write a fast memcpy function:

loop unrolling
transfer of aligned data in large chunks (often using SIMD)
CPU caching hints (SIMD helps here as well)

How is Array.Copy implemented in C#?

2 Answers2

Linked