0

I am trying to set the values of a struct object equal to another struct object of the same type using void pointers. The values don't come out as expected when using different size uints.

A simple example that demonstrates the problem, let's say I have the struct:

struct testUint{
    uint32 m_num1;
    uint32 m_num2;
    uint32 m_num3;
};

In main I do:

testUint test_uint1;
testUint test_uint2;

test_uint2.m_num1 = 3;
test_uint2.m_num2 = 6;
test_uint2.m_num3 = 9;

void* p = &test_uint1;

*(uint32*) p = test_uint2.m_num1;
p = (uint32*)p+1;

*(uint32*) p = test_uint2.m_num2;
p = (uint32*)p+1;

*(uint32*) p = test_uint2.m_num3;
p = (uint32*)p+1;

cout << test_uint1.m_num1 << test_uint1.m_num2 << test_uint1.m_num3 <<endl;

The output comes out as expected as: 3, 6, and 9

But when I change the struct to:

struct testUint{
    uint32 m_num1;
    uint64 m_num2;
    uint32 m_num3;
};

and keep the same code as before except change the middle portion to be:

*(uint32*) p = test_uint2.m_num1;
p = (uint32*)p+1;

*(uint64*) p = test_uint2.m_num2;
p = (uint64*)p+1;

*(uint32*) p = test_uint2.m_num3;
p = (uint32*)p+1;

The output becomes: 3, 38654705664, and 3435973836.

I am not really sure why it is behaving like this since I believe I am incrementing by the proper number of bytes.

pepe1
  • 205
  • 1
  • 3
  • 18
  • Please post a [mcve]. – R Sahu Jul 11 '18 at 18:46
  • However you spin it, this code is not portable and not standard, do you actually have a use case for it? If not i'd just bin it off. – George Jul 11 '18 at 18:47
  • Between `m_num1` and `m_num2` in your second structure there will most likely be padding. Also incrementing pointer past the end of the object it is pointing to and then dereferencing the pointer is Undefined Behaviour. – Richard Critten Jul 11 '18 at 18:47
  • You can't use pointer arithmetics to hope between objects like that. These operation are only supported for objects that belong to the same array. – François Andrieux Jul 11 '18 at 18:47
  • @FrançoisAndrieux right, of course, pardon me – alter_igel Jul 11 '18 at 18:49
  • @alterigel I was actually mistaken, and you were correct. I hadn't noticed the `uint64` mixed with the `uint32`s in the second part of the question. – François Andrieux Jul 11 '18 at 18:49
  • 4
    The duplicate is not applicable. This question is not performing pointer arithmetic on a `void*`. The pointer is always cast before incrementing it here. C casts occur before `operator+`. – François Andrieux Jul 11 '18 at 18:51
  • 4
    You're not accounting for padding within the structure, which is not specified by the standard. P.S. the question is reproducible: https://ideone.com/CiFRif – Mark Ransom Jul 11 '18 at 18:52

1 Answers1

3

To start with, I'm going to ignore the fact that your code is engaging in some very significant undefined behavior, and jump straight to the part where in a UNIX/Windows environment, the behavior of your code is relatively predictable (if not standard compliant).

Your code is assuming that the layout of the version that uses uint64_t for its second member looks like this (2 characters == 1 byte):

-1-+-+-+-5-+-+-+-+10-+-+-+-+15-+-+-+-+20-+-+-+-+25
11111111222222222222222233333333__________________

But in actuality, due to padding, it's getting laid out like this:

-1-+-+-+-5-+-+-+-+10-+-+-+-+15-+-+-+-+20-+-+-+-+25
11111111........222222222222222233333333__________

This means that when you're assigning values into the struct, you're getting values like this (assuming Little Endian, based on your results):

-1-+-+-+-5-+-+-+-+10-+-+-+-+15-+-+-+-+20-+-+-+-+25
03000000060000000000000009000000????????__________

This means the 6 is getting written into padding, and isn't readable when accessing the members directly. Meanwhile, the 9 is getting written inside m_num2, and m_num3 is receiving complete garbage. 0x0000000009000000 in little-endian hexadecimal converts to 38654705664 in decimal, which is why that's what you get for your second value. And because the third value is garbage, it could be literally anything, and 3435973836 is just what you happened to get on this particular execution.

Now, back to the Undefined Behavior thing: this is why you shouldn't write code like this. Because the padding being used in this struct is implementation defined (for good reason), and it's bad to depend on behavior like this to ascertain the correctness of your code.

If you absolutely do need to depend on bit-twiddling things like this, there are a few things you should be doing:

  • Use char* or uint8_t*, not void*: arithmetic on a void* is not allowed in C++, and even though you're not technically performing arithmetic on a void* in your code, you're still leaving your code in a kludgy state that would be vulnerable to code alterations that would perform a void* arithmetic.
  • Use idioms for accessing the specific offset of a given member: for example, offsetof.

A better version of your code looks like this:

#include<iostream>
#include<cstddef>
#include<cstdint>

struct testUint{
    uint32_t m_num1;
    uint64_t m_num2;
    uint32_t m_num3;
};

int main() {
    testUint test_uint1;
    testUint test_uint2;

    test_uint2.m_num1 = 3;
    test_uint2.m_num2 = 6;
    test_uint2.m_num3 = 9;

    //Prefer reinterpret_cast, not raw C-style casts
    uint8_t * p = reinterpret_cast<uint8_t*>(&test_uint1);

    *reinterpret_cast<uint32_t*>(p + offsetof(testUint, m_num1)) = test_uint2.m_num1;
    *reinterpret_cast<uint64_t*>(p + offsetof(testUint, m_num2)) = test_uint2.m_num2;
    *reinterpret_cast<uint32_t*>(p + offsetof(testUint, m_num3)) = test_uint2.m_num3;

    //Don't use 'using namespace std;'
    std::cout << test_uint1.m_num1 << ' ' << test_uint1.m_num2 << ' ' << test_uint1.m_num3 << std::endl;
}

A much better solution finds a way to make the interface of testUint visible to the scope where p exists, and avoid this pointer casting altogether.

Xirema
  • 19,889
  • 4
  • 32
  • 68
  • This may be unrelated to the original question but I was wondering if it would be possible to set the values of test_uint1 without knowing the names of the member variables? Would still know the type and order of them though. – pepe1 Jul 12 '18 at 12:17
  • 1
    @pepe1 If you correctly calculate the offsets, it would be possible. But like I showed in my post, you have to be careful with what you assume/determine those offsets to be. Your code failed because your assumptions about the layout of the memory of `testUint` were incorrect, but if you had been calculating the correct memory locations, it would have worked (in a way that is technically undefined behavior). – Xirema Jul 12 '18 at 14:08