
The MSDN documentation for cpblk is a bit sparse:

The cpblk instruction copies a number (type unsigned int32) of bytes from a source address (of type *, native int, or &) to a destination address (of type *, native int, or &). The behavior of cpblk is unspecified if the source and destination areas overlap.

cpblk assumes that both the source and destination addresses are aligned to the natural size of the machine. The cpblk instruction can be immediately preceded by the unaligned. instruction to indicate that either the source or the destination is unaligned.

Ok, compared to other bulk copy operations such as Array.Copy, Marshal.Copy, and Buffer.BlockCopy, we know that:

  • The size is measured in bytes
  • The pointers should be aligned

This leaves me with some questions:

  • Should the buffers be pinned first? Does it matter whether the operand type is native int, "unmanaged pointer" or "managed pointer (&)"?
  • Are there restrictions on the type? (for example, Buffer.BlockCopy only works on primitive types, not structures even if they contain only primitive types)

According to https://stackoverflow.com/a/26380105/103167 pinning is unnecessary, but the supporting explanation is just wrong. (I suspect it is an overgeneralization from the fact that the Large Object Heap isn't compacted.)

ECMA-335 isn't very helpful either. The instruction description there contains the same verbiage and adds

[Rationale: cpblk is intended for copying structures (rather than arbitrary byte-runs). All such structures, allocated by the CLI, are naturally aligned for the current platform. Therefore, there is no need for the compiler that generates cpblk instructions to be aware of whether the code will eventually execute on a 32-bit or 64-bit platform. end rationale]

Ok, this sounds like it should accept more types than Buffer.BlockCopy. But still not arbitrary types.

Perhaps the newly released .NET Core source code will hold some answers.

Ben Voigt

1 Answer


cpblk and its companion, initblk, map directly to the intrinsics that any native-language compiler depends on to initialize and copy structures. There is no need to wait for the .NET Core source; you can see their semantics in SSCLI20, clr/src/fjit/fjitdef.h. It is a simple jitter that converts cpblk directly to a call to memcpy() and initblk to a call to memset(), the same intrinsics that a C compiler uses.

They have no regard for the GC, of course; the C# and VB.NET compilers don't use these opcodes at all. But the C++/CLI compiler does. A simple example:

using namespace System;

struct s { int a; int b;  };

int main(array<System::String ^> ^args)
{
    s var = {};        // initblk
    s cpy = var;       // cpblk
    return 0;
}

Optimized MSIL:

.method assembly static int32  main(string[] args) cil managed
{
  // Code size       34 (0x22)
  .maxstack  3
  .locals ([0] valuetype s cpy,
           [1] valuetype s var)
  IL_0000:  ldloca.s   var
  IL_0002:  ldc.i4.0
  IL_0003:  ldc.i4.8
  IL_0004:  initblk
  IL_0006:  ldloca.s   cpy
  IL_0008:  ldloca.s   var
  IL_000a:  ldc.i4.8
  IL_000b:  cpblk
  ...
}

The current .NET jitters generate inline code with simple register moves for small structures, REP STOS/MOVS for large ones. Very similar to what Buffer.Memcpy() does.

Hans Passant
  • But is that the native `memcpy`, without the fixups for GC integration? Or one provided by the CLR? I'm seeing `#define memcpy` in some of the headers. – Ben Voigt Dec 26 '14 at 02:47
  • Also, I assume `Buffer.Memcpy` was supposed to be `Buffer.BlockCopy`? I was actually just after a version of `Buffer.BlockCopy` which isn't artificially limited to primitives (structures of primitives can be safely blitted also, but `Buffer.BlockCopy` throws an exception instead of doing its job) – Ben Voigt Dec 26 '14 at 02:54
  • Fjit doesn't tinker with memcpy(). Or memset(). No, I meant the internal Buffer.Memcpy() method, just look at the Reference Source. Wishing GC semantics and non-blittable object layout away is a non-starter. – Hans Passant Dec 26 '14 at 12:26
  • I'm not wishing non-blittable layout away. I'm wishing that Buffer.BlockCopy would correctly identify blittable value types. Right now it just does IsPrimitive, which is a very conservative test and gives wrong results for (C# example) struct Triple { int x, y, z; } – Ben Voigt Dec 26 '14 at 20:11
  • @HansPassant: Bottom line, I didn't understand: Is pinning necessary? – Yaakov Shoham Dec 30 '15 at 12:01
  • @Y.Shoham it's necessary that the locations aren't going to move. If they're on the heap, then pinning is necessary. If they're on the stack, then pinning is never necessary. – Jon Hanna Nov 29 '17 at 21:16
  • I've done some empirical testing on the latest .NET Framework 4.7.2 bits. You don't need to pin managed pointer inputs since the entire `cpblk` operation is GC protected by being sealed within a single implicit sequence point of the execution stack. The bigger problem is if the block contains managed references. Turns out, yes `cpblk` does work fine so long as the `src` and `dst` are in the same GC generation. But copying a block containing managed references from within a SOH (<85000 byte) object to a Large Object Heap (LOH) object very reliably produces a (delayed) CLR fatal execution error. – Glenn Slayden Sep 20 '18 at 18:05
  • Sorry, forgot to address @Y.Shoham and @BenVoigt. As for the fatal execution error, I'm considering filing the bug report, though it seems certain to be `by-design'd`; it was in searching for docs on the intended behavior that I arrived at this page. – Glenn Slayden Sep 20 '18 at 18:09
  • @Y.Shoham As for (e.g.) `unaligned. 1 cpblk` vs. `cpblk`, I have yet to find a case in either 32- or 64-bit mode where it makes any difference (note that many docs say the latter is the only case where it *could* matter). But the whole notion is confusing because there are four cases: is the prefix required on mis-aligned source (only), destination (only), either (only), or both (all)? And for the last two cases, which mis-alignment skew should be specified as the `unaligned.` value (presumably the lower/worst)? Seems safe enough to put a (vacuous?) `unaligned. 2` on string ops for good measure. – Glenn Slayden Sep 20 '18 at 18:41
  • At https://github.com/dotnet/coreclr/issues/20086 it is suggested that using `cpblk` on blocks containing managed references isn't a good idea even if it works within the Small Object Heap. Which is back to what @HansPassant was saying, just above, in the first place. – Glenn Slayden Sep 21 '18 at 05:24