
While reading the source code of RocksDB's skiplist, I found the following code:

  int UnstashHeight() const {
    int rv;
    memcpy(&rv, &next_[0], sizeof(int));
    return rv;
  }

Why does it use memcpy? What if we used a pointer type cast instead, like this:

  int UnstashHeight() const {
    int rv;
    rv = *((int*)&next_[0]);
    return rv;
  }

Does memcpy have better portability across different CPU targets, or is there no difference at all?
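
For context (since it probably matters what `next_` is): in the skiplist node, `next_` is the trailing array of next pointers, and the height is stashed in the storage of `next_[0]` before the node gets linked into the list. From memory the surrounding code looks roughly like the sketch below (paraphrased, not the exact RocksDB source):

  #include <atomic>
  #include <cassert>
  #include <cstring>

  // Paraphrased sketch of the relevant part of the skiplist node. The height
  // is stashed in the bytes of next_[0] before the node is linked, and read
  // back with UnstashHeight().
  struct Node {
    void StashHeight(const int height) {
      assert(sizeof(int) <= sizeof(next_[0]));
      std::memcpy(&next_[0], &height, sizeof(int));
    }

    int UnstashHeight() const {
      int rv;
      std::memcpy(&rv, &next_[0], sizeof(int));
      return rv;
    }

    std::atomic<Node*> next_[1];  // variable-length in the real code
  };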

Myrfy
  • Depends on what `next_` is. See [mcve]. – user3386109 May 20 '20 at 15:43
  • You don't give much to go by really. But most likely it's to avoid undefined behavior related to strict aliasing. Certain uses of memcpy are well-defined while plain punning simply isn't. – StoryTeller - Unslander Monica May 20 '20 at 15:44
  • Most probably, to prevent [strict aliasing](https://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule) violation. – Evg May 20 '20 at 15:44
  • Some architectures may require pointer alignment to access a machine word (none is required for byte access). – Matt May 20 '20 at 15:46
  • Your second code example is invoking undefined behavior. Type punning is explicitly disallowed by the standard due to “strict aliasing.” That’s the difference. Also, as @Matt said, some architectures (like older ARM versions) will refuse to read an unaligned memory address; they may helpfully throw an exception, but they may also not. The reason is that an unaligned memory access requires at least two reads from memory, and then some logic to decode and shift the bits around to get the value (so to lower the memory circuit’s complexity, some architectures literally *can’t*). – Cole Tobin May 20 '20 at 16:09
  • For byte (octet) quantities that fit inside a processor's register, `memcpy` will at a minimum require the overhead of a function call and return. The call to `memcpy` may be replaced by an assignment, depending on the compiler's optimization skills and the optimization settings. In this case, assignment is more efficient. – Thomas Matthews May 20 '20 at 18:09
  • The `memcpy` function may be optimized for a processor's instruction set. For example, on a 32-bit processor the `memcpy` may use a loop of 32-bit assignment operations when the length is a multiple of 32 bits. The `memcpy` may use specialized processor instructions for block copying. The `memcpy` could also use a DMA processor. It all depends on many factors surrounding the copy. – Thomas Matthews May 20 '20 at 18:13
  • IMHO, `memcpy` should be avoided. Pointers and references exist to avoid having to copy blocks of memory. The `memcpy` produces a mirror image of a `struct` or `class`, but performs only a shallow copy (a problem when the `struct` or `class` has pointers). Yes, I know there are instances where data copying is mandatory; but the majority of the time copying large data structures can be avoided. – Thomas Matthews May 20 '20 at 18:15
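
To make the aliasing and alignment points in the comments above concrete, here is a minimal, self-contained sketch (a hypothetical read_int_at helper, not code from RocksDB) that reads an int from an arbitrary byte offset of a raw buffer:

  #include <cstring>

  // Hypothetical helper: read a 32-bit int starting at an arbitrary byte
  // offset of a raw buffer (e.g. a network packet or a file image).
  int read_int_at(const unsigned char* buf, std::size_t offset) {
    // *reinterpret_cast<const int*>(buf + offset) would violate strict
    // aliasing, and buf + offset is usually not suitably aligned, which can
    // trap on targets that require aligned loads.
    int value;
    std::memcpy(&value, buf + offset, sizeof(value));  // well-defined everywhere
    return value;
  }

With optimizations enabled, compilers targeting machines that allow unaligned loads emit a single load for this memcpy; on machines that do not, they emit whatever byte-wise sequence is needed, which is exactly the portability the question asks about.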

1 Answer


I would say:

  • memcpy is a function; taken literally it is not as simple as the single machine instruction that an assignment would resolve to, although modern compilers usually recognize a small fixed-size memcpy and inline it down to exactly that (a quick way to check this on your target is sketched below)
  • a plain assignment could be optimized by the compiler in some contexts, for example by letting multiple variables declared in your code share the same value in memory (obviously only where that makes sense)
  • because it typically maps onto a single machine instruction, the assignment has the constraints of the platform it runs on. As stated in a comment, ARM processors require data to be aligned to 2, 4 or 8 bytes according to the width being accessed (4 bytes / 32 bits in this case). If the constraint is not satisfied, an interrupt is raised.
  • memcpy works great on data coming from the network or from codecs (big-endian/little-endian issues aside), where structures try to use as few bytes and bits as possible. There, your word may not be aligned to a 32-bit boundary, but memcpy will take care of this.
  • the assignment target is an instance of an int, so the assignment does not require a check on the destination: it is allocated and valid by definition (the compiler guarantees that). The memcpy destination is a pointer; if you use that function widely, you may need to start checking around your code that the destination pointer is not null.

That said, maybe I'm a little outside your target, but there is not enough code here to judge the formalism used for the assignment.
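
If you want to check the "function call" point on a specific compiler and target, a stand-alone sketch like the one below (a stand-in Node, not RocksDB's real one) can be dropped into a compiler explorer. With optimizations on, the memcpy version is generally lowered to the same single load as the cast, while staying well-defined:

  #include <cstring>

  // Stand-in node: an int stashed in the storage of a pointer slot.
  struct Node {
    void* next_[1];
  };

  int unstash_memcpy(const Node& n) {
    int rv;
    std::memcpy(&rv, &n.next_[0], sizeof(rv));  // defined behavior
    return rv;
  }

  int unstash_cast(const Node& n) {
    return *reinterpret_cast<const int*>(&n.next_[0]);  // strict-aliasing UB
  }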

My 2 cents

Stefano Buora
  • Is the interrupt raised when fetching on odd boundaries? My understanding is that the ARM processor would require an additional fetch, but no interrupts would be raised. I haven't encountered an interrupt being raised in over 10 years of using ARM processors. – Thomas Matthews May 20 '20 at 18:18
  • This also depends on the configuration of the ARM processor. On a pure 32-bit configuration, the processor always fetches 32 bits. It may perform some shifting after it reads the data (such as when fetching 16-bit or 8-bit quantities). The 8/32-bit ARMs have different fetching strategies, as do the 8/16/32-bit ones. If the ARM processor is configured for 16 bits, it may always fetch 16-bit quantities and make 2 fetches to build a 32-bit quantity. Again, it depends on the ARM model and its configuration. – Thomas Matthews May 20 '20 at 18:21
  • I have had issues while deserializing data such as savegames. In those scenarios the data was not exported by doing a memcpy of the whole data structure (which would have preserved the alignment the compiler forces on the struct size as well as on its members), but by picking up the relevant fields from different structs. That kind of sampling doesn't guarantee that 32-bit integers end up aligned to 32-bit addresses in the serialized archive, causing trouble during the "next" load phase. Maybe it's a matter of processor configuration, but on mobile devices it happens. (Were they all odd addresses?) – Stefano Buora May 20 '20 at 21:17