4

Let's say my code is:

typedef struct {
  int x;
  double y;
  char z;
} Foo;

Would x, y, and z be right next to each other in memory? Could pointer arithmetic 'iterate' over them? My C is rusty, so I cannot quite get the program right to test this. Here is my code in full.

#include <stdlib.h>
#include <stdio.h>

typedef struct {
  int x;
  double y;
  char z;
} Foo;


int main() {
  Foo *f = malloc(sizeof(Foo));
  f->x = 10;
  f->y = 30.0;
  f->z = 'c';
  // Pointer to iterate.
  for(int i = 0; i == sizeof(Foo); i++) {
    if (i == 0) {
      printf(*(f + i));
    }
    else if (i == (sizeof(int) + 1)) {
      printf(*(f + i));
    }
    else if (i ==(sizeof(int) + sizeof(double) + 1)) {
      printf(*(f + i));
    }
    else {
      continue;
    }
  return 0;
}
Marco Bonelli
David Frick
  • No, you can't be sure since the compiler may pad fields in the structure. – jmq Jan 21 '20 at 22:32
  • Does this answer your question? [Struct memory layout in C](https://stackoverflow.com/questions/2748995/struct-memory-layout-in-c) – Mark Snyder Jan 21 '20 at 22:36

6 Answers

15

No, it is not guaranteed for struct members to be contiguous in memory.

From §6.7.2.1 point 15 in the C standard (page 115 here):

There may be unnamed padding within a structure object, but not at its beginning.

Most of the time, something like:

struct mystruct {
    int a;
    char b;
    int c;
};

is indeed aligned to sizeof(int), like this (assuming a 4-byte int):

 0  1  2  3  4  5  6  7  8  9  10 11
[a         ][b][padding][c          ]
Marco Bonelli
13

Yes and no.

Yes, the members of a struct are allocated within a contiguous block of memory. In your example, an object of type Foo occupies sizeof (Foo) contiguous bytes of memory, and all the members are within that sequence of bytes.

But no, there is no guarantee that the members themselves are adjacent to each other. There can be padding bytes between any two members, or after the last one. The standard does guarantee that the first defined member is at offset 0, and that all the members are allocated in the order in which they're defined (which means you can sometimes save space by reordering the members).

Normally compilers use just enough padding to satisfy the alignment requirements of the member types, but the standard doesn't require that.

So you can't (directly) iterate over the members of a structure. If you want to do that, and if all the members are of the same type, use an array.

You can use the offsetof macro, defined in <stddef.h>, to determine the byte offset of a (non-bit-field) member, and it can sometimes be useful to use that to build a data structure that can be used to iterate over the members of a structure. But it's tedious, and rarely more useful than simply referring to the members by name -- particularly if they have different types.

Keith Thompson
  • This answered what would have been my next question, i.e. is there a way to determine the padding and then do pointer arithmetic with the padding in mind? Interesting. Thanks. – David Frick Jan 22 '20 at 06:38
1

Would x, y, and z be right next to each other in memory?

No. The struct memory layout is implementation-dependent: there is no guarantee that struct members are right next to each other. One reason is memory padding.

Could pointer arithmetic 'iterate' over them?

No. You can only do pointer arithmetic for pointers to the same type.

artm
  • I don't think "the same type" is the issue here. If you've got a struct with three `int`s in it, it isn't okay to use pointer-arithmetic to iterate from the first int to the last even though they are the same type because they are each separate objects. – Christian Gibbons Jan 21 '20 at 22:41
1

Would x, y, and z be right next to each other in memory?

They could be, but don't have to be. The placement of elements in structures is not mandated by the ISO C standard.

In general, a compiler will place the elements at offsets that are "optimal" for the architecture it compiles for. So, on 32-bit CPUs, most compilers will, by default, place elements at offsets that are multiples of 4 (as that makes for the most efficient access). But most compilers also have ways to specify a different placement (alignment).

So, if you have something like:

struct X {
    uint8_t a;
    uint32_t b;
};

Then the offset of a would be 0, but the offset of b would be 4 on most 32-bit compilers with default options.

Could pointer arithmetic 'iterate' over them?

Not like the code in your example. Pointer arithmetic on a pointer to a structure moves the address in multiples of the size of the structure. So, if you have:

struct X a[2];
struct X *p = a;

then p+1 == &a[1], i.e. it points to the second struct X, not to the second member of the first one.

To "iterate" over elements you would need to cast p to uint8_t* and then add the offset of the element to it (using the standard offsetof macro), element by element.

srdjan.veljkovic
  • I wouldn't expect to see padding for the example struct you provided. I would change `b` to be of type `uint32_t` or something like that which has stricter alignment requirements than `uint8_t` – Christian Gibbons Jan 21 '20 at 23:00
  • @ChristianGibbons You are correct that `uint32_t` is a better example, updated. But, it is quite possible that even with `uint8_t` there would be padding, for faster access. – srdjan.veljkovic Jan 21 '20 at 23:08
  • I wrote a quick program to test on my specific configuration and `sizeof(struct X)` came out to `2` when both members are `uint8_t`. I'll have to consult the standard to see if there's anything about whether padding is allowed if alignment requirements can be met without padding. – Christian Gibbons Jan 21 '20 at 23:15
  • IANALL, but I have used compilers that would pad in this scenario. Accessing odd addresses may be slower even if you access byte. This is not the case for the x86, but there are _many_ CPUs out there... – srdjan.veljkovic Jan 21 '20 at 23:36
  • I just read the section in the C11 standard working draft on structs, and can confirm nothing in there preventing adding padding even if the type is already aligned. (section 6.7.2.1, if you're interested) – Christian Gibbons Jan 21 '20 at 23:39
0

It depends on the padding decided on by the compiler (which is influenced by the requirements and advantages of the target architecture). The C standard does guarantee that there is no padding before the first member of a struct, but after that, you cannot assume anything. However, if the sizeof the struct equals the sum of the sizeof of each of its members, then there is no padding.

You can enforce no padding with a compiler-specific directive. On MSVC, that's:

#pragma pack(push, 1)
// your struct...
#pragma pack(pop)

GCC has __attribute__((packed)) for the equivalent effect.

Govind Parmar
0

There are multiple issues with trying to use pointer arithmetic in this matter.

The first issue, as has been mentioned in other answers, is that there could be padding throughout the struct throwing off your calculations.

C11 working draft 6.7.2.1 p15: (bold emphasis mine)

Within a structure object, the non-bit-field members and the units in which bit-fields reside have addresses that increase in the order in which they are declared. A pointer to a structure object, suitably converted, points to its initial member (or if that member is a bit-field, then to the unit in which it resides), and vice versa. There may be unnamed padding within a structure object, but not at its beginning.

The second issue is that pointer arithmetic is done in multiples of the size of the type being pointed to. In the case of a struct, if you add 1 to a pointer to a struct, the pointer will be pointing to an object after the struct. Using your example struct Foo:

Foo x[3];
Foo *y = x+1; // y points to the second Foo (x[1]), not the second byte of x[0]

6.5.6 p8:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the (i+n)-th and (i−n)-th elements of the array object, provided they exist.

A third issue is that performing pointer arithmetic such that the result points more than one past the end of the object causes undefined behavior, as does dereferencing a pointer to one element past the end of the object obtained through the pointer arithmetic. So even if you had a struct containing three ints with no padding in between and took a pointer to the first int and incremented it to point to the second int, dereferencing it would cause undefined behavior.

More from 6.5.6: (bold-italic emphasis mine)

Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

A fourth issue is that dereferencing a pointer to one type as another type results in undefined behavior. This attempt at type-punning is often referred to as a strict-aliasing violation. The following is an example of undefined behavior through strict-aliasing violation even though the data types are the same size (assuming 4-byte int and float) and nicely aligned:

int x = 1;
float y = *(float *)&x;

6.5 p7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,

  • a qualified version of a type compatible with the effective type of the object,

  • a type that is the signed or unsigned type corresponding to the effective type of the object,

  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

  • a character type.

Summary: No, a C struct does not necessarily hold its members in contiguous memory, and even if it did, you still couldn't do what you want to do with pointer arithmetic.

Christian Gibbons