
I have a struct named Characters, a variable of that struct named a, and a pointer to char named pChar. I want to change the value of FirstChar in a to 27 through pChar.

struct Characters {
    short CharCount;
    char FirstChar;
    char SecondChar;
    char ThirdChar;
};

struct Characters a = {3, 4, 5, 6};

char* pChar;
pChar = (char*)&a;
*(pChar+2) = 27;

Can I cast the struct a to char* safely? And will the padding be added to the end of the struct? If I used *(pChar+2) will I be guaranteed that it will be referring to FirstChar?

Mike32ab
  • Do you mind saying why you need to do this? You can probably get it to work, but you might not have to, if there's a better way of accomplishing your actual goal. – Steve Summit Nov 04 '19 at 17:45

2 Answers


Can I cast the struct a to char* safely?

The cast itself is safe. Accessing the contents through the pointer, as in *(pChar+2) = 27;, is not guaranteed to do what you want.

Using char* is fine as far as the aliasing rules are concerned, since a character pointer may be used to access the bytes of any object. The problem is the layout: the standard does not guarantee which member sits at byte offset 2, so when writing the value 27 there, there are formally no guarantees of what the result will be. You might be writing to a padding byte.

And will the padding be added to the end of the struct?

Padding may be added anywhere in the struct, except at the very beginning.
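
To see where a particular compiler actually puts the members and the padding, the layout can be inspected with offsetof and sizeof. The following is only a sketch; the helper names trailing_padding and print_layout are made up for illustration, and the numbers printed depend on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct Characters {
    short CharCount;
    char FirstChar;
    char SecondChar;
    char ThirdChar;
};

/* Hypothetical helper: bytes of padding after the last member,
   i.e. total size minus the end of ThirdChar. */
size_t trailing_padding(void)
{
    return sizeof(struct Characters)
         - (offsetof(struct Characters, ThirdChar) + sizeof(char));
}

/* Hypothetical helper: print where this compiler placed each member. */
void print_layout(void)
{
    /* The standard guarantees there is no padding before the first member. */
    assert(offsetof(struct Characters, CharCount) == 0);

    printf("CharCount  at offset %zu\n", offsetof(struct Characters, CharCount));
    printf("FirstChar  at offset %zu\n", offsetof(struct Characters, FirstChar));
    printf("SecondChar at offset %zu\n", offsetof(struct Characters, SecondChar));
    printf("ThirdChar  at offset %zu\n", offsetof(struct Characters, ThirdChar));
    printf("sizeof = %zu, trailing padding = %zu\n",
           sizeof(struct Characters), trailing_padding());
}
```

Calling print_layout() from main shows the layout for the current build only; a different compiler or set of options may give different offsets.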


In practice, doing things like this is likely rather safe, even though it is not recommended and the standard gives no guarantee about the layout. I have yet to encounter a system where it wouldn't work safely and deterministically. But you have to ensure that there is no padding:

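#include <assert.h>  /* provides the static_assert macro in C11 */

struct Numbers {
    short CharCount;
    char FirstChar;
    char SecondChar;
    char ThirdChar;
};

/* Compilation fails if the struct is larger than the sum of its members. */
static_assert(sizeof(struct Numbers) == 
                sizeof(short) +
                sizeof(char)  +
                sizeof(char)  +
                sizeof(char), 
              "Padding found");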

To actually disable padding there is usually some non-standard way such as #pragma pack(1).
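
As a rough sketch of how that can look: the push/pop form of #pragma pack used below is a compiler extension, not standard C. It is accepted by GCC, Clang and MSVC, but other compilers may spell it differently or not support it at all. The names PackedCharacters and packed_size are hypothetical and only serve the example.

```c
#include <assert.h>
#include <stddef.h>

#pragma pack(push, 1)   /* non-standard: request 1-byte alignment for members */
struct PackedCharacters {
    short CharCount;
    char FirstChar;
    char SecondChar;
    char ThirdChar;
};
#pragma pack(pop)       /* restore the previous packing setting */

/* Compilation fails if the compiler still inserted padding. */
static_assert(sizeof(struct PackedCharacters) ==
                sizeof(short) + 3 * sizeof(char),
              "Padding found");

/* Hypothetical helper so the size can also be checked at run time. */
size_t packed_size(void)
{
    return sizeof(struct PackedCharacters);
}
```

Keeping the static assert next to the pragma means a compiler that ignores the pragma produces a build error instead of a silently different layout.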

EDIT

The best and most portable way to reliably get the address of a struct member might be this:

#include <stddef.h>

struct Numbers x;
char* ptr = (char*)&x + offsetof(struct Numbers, FirstChar);
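
Applied to the struct from the question, the same idea might look like the sketch below. The function name set_first_char is an assumption made for the example; the point is only that offsetof supplies the member's real offset, whatever padding the compiler chose.

```c
#include <assert.h>
#include <stddef.h>

struct Characters {
    short CharCount;
    char FirstChar;
    char SecondChar;
    char ThirdChar;
};

/* Hypothetical helper: write FirstChar through a char pointer to the struct. */
void set_first_char(struct Characters *s, char value)
{
    assert(s != NULL);

    char *pChar = (char *)s;
    /* offsetof gives the actual byte offset of FirstChar on this compiler,
       so no assumption about padding or sizeof(short) is needed. */
    pChar[offsetof(struct Characters, FirstChar)] = value;
}
```

With struct Characters a = {3, 4, 5, 6};, calling set_first_char(&a, 27) changes a.FirstChar and leaves the other members alone.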
Lundin
  • in the "check if there's a padding" part, wouldn't it be `3*sizeof(char)` (== `3` since `sizeof(char)==1`) clearer/more correct, since we have three chars and not an array of three chars? – ShinTakezou May 27 '15 at 06:42
  • @ShinTakezou Yeah either works, I was being lazy. The most readable form is perhaps `sizeof(short)+sizeof(char)+sizeof(char)+sizeof(char)` because then the intent is perfectly clear. – Lundin May 27 '15 at 06:45
  • it would "map" directly to the struct, but mathematically `x + x + x` is `3x`, so it is shorter without breaking the "map", thus looks clearer. Just irrelevant, but `sizeof(char[3])` seemed wrong (even if it is not, I mean), because there are no arrays in the struct. - **seen your edit**, the formatting makes the "map" explicit and so I agree it looks clearer wrt the struct itself. – ShinTakezou May 27 '15 at 06:48
  • If there was no padding, for example if this struct was 4 bytes and made up of only members CharCount, FirstChar and SecondChar, and I used *(pChar + sizeof(short)) = 27 would I be guaranteed that it will be referring to FirstChar? – Mike32ab May 27 '15 at 21:06
  • @Rm32a Portably, you can only ensure that there is no padding by using a compiler option such as `#pragma pack` combined with a static assert as I showed. If the static assert doesn't kick in and you only use character type for the pointer arithmetic (as you do), then I'd say it is guaranteed to work. However, I just now realized there are actually better ways, see edit. – Lundin May 28 '15 at 07:39

You can cast any pointer type into another pointer type and it should still work in the sense that it points to the same memory space, if that's what you are asking.

However, it's hard to determine what's going to happen if you modify a variable in the structure the way you are doing, mainly because of memory alignment.

You can suppress memory alignment in most compilers by setting #pragma pack(1).

If you do so, and short is 2 bytes long (which is not guaranteed, whereas sizeof(char) is always 1 by definition), only then is it guaranteed that modifying pChar[2] will change the value of FirstChar.
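
One way to make that assumption explicit, sketched here under the assumption of a C11 compiler, is to let the build fail whenever FirstChar is not at byte offset 2. The function name poke_first_char is invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct Characters {
    short CharCount;
    char FirstChar;
    char SecondChar;
    char ThirdChar;
};

/* Refuse to compile on any target where FirstChar is not at byte offset 2,
   for example because short is wider than 2 bytes or padding was inserted. */
static_assert(offsetof(struct Characters, FirstChar) == 2,
              "FirstChar is not at offset 2");

/* Hypothetical helper: the hard-coded index is only valid
   because of the static assert above. */
void poke_first_char(struct Characters *s)
{
    char *pChar = (char *)s;
    pChar[2] = 27;
}
```

This keeps the original pChar[2] access but turns a silent layout mismatch into a compile-time error.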

Havenard
  • When I used sizeof(Characters) it returns 6. In this specific case it seems like the 1 byte padding will be added at the end since the short variable CharCount determines the alignment. Are struct members assigned to consecutive memory locations and are they aligned in the order they're specified in the struct declaration? – Mike32ab May 27 '15 at 06:35
  • It might not be being added in the end. Padding can happen between the members of the structure. – Havenard May 27 '15 at 06:36
  • @Rm32a Struct members have to be allocated in the order they are declared, although there may be any number of padding bytes between them or at the end. And you can't really predict where, in advance. – Lundin May 27 '15 at 06:37
  • *You can cast any pointer type into another pointer type and should still work in the sense that it allow you to read the same memory space* [false, strict aliasing rules exist](http://stackoverflow.com/a/7005988/1013719). You can't make any type of pointer point to whatever you want, much less dereference it – Ryan Haining May 27 '15 at 06:48
  • @RyanHaining Well, yeah... but character pointers are excluded from those rules actually. It has always been a bit muddy. – Lundin May 27 '15 at 06:49
  • Is the location where the padding is added based on the members of the struct and the order of declaration of those struct members? Where do you think the padding will be added for this specific struct? – Mike32ab May 27 '15 at 06:53
  • @Rm32a There are many factors for where and why it happens. It should depend on the architecture, but it mostly depends on the compiler. Some compilers will pad everything to be addressed exactly 4 bytes apart, and that struct would have a size of 16 bytes. Apparently in your case it's aligning by 2 bytes and interpreting the structure as `short, char[3]`, which would explain why it's not padding between the chars. – Havenard May 27 '15 at 06:58
  • Should I just add a fourth char variable after ThirdChar to remove the padding uncertainty, or should I just set the memory alignment to 1? If I set the memory alignment to 1 will that have any effect on program speed? – Mike32ab May 27 '15 at 07:14
  • @RyanHaining Those rules are laughed at even in standard libs, such as `socket.h`, where it's common practice to cast `struct sockaddr_in*` to `struct sockaddr*`, for instance. There's nothing wrong with doing that. The reason the documentation deems it undefined behaviour is that it's impossible to define, in a documented way, what's going to happen if you recast a struct to a type completely different in shape and size and try to use it as if all was well. But there are well-known behaviours that can be relied on as long as you know what you are doing. – Havenard May 27 '15 at 07:16
  • `socket.h` is not part of the C standard library, it depends on behavior defined by posix, not by C. Your claim is flawed. – Ryan Haining May 27 '15 at 16:50
  • @RyanHaining The behaviour of the structure data types is not defined by POSIX. – Havenard May 27 '15 at 16:52
  • @Lundin Yeah I know, and it works with OPs question, but this answer makes a much much broader claim. – Ryan Haining May 27 '15 at 16:52