
I was attempting to explain a concept to a coworker and came to the realization that my understanding was incorrect.

How are structs with arrays embedded assignable?

For example:

#include <stdint.h>

typedef struct {
    uint8_t data[8];
} Test;
/* ... */
Test test1;
Test test2;
/* ... some assignment to test1 ... */
test2 = test1;

I know that if data were a pointer we would need to implement a deep copy, but I'm trying to fully understand how this works.

My thought process is that 'data' would normally be the pointer to the first element and that &data would be the address of that pointer. In the case of the struct, is the struct address what the compiler is using to access the array?

Can someone explain the language mechanism that allows this? Is this just syntactic sugar for C structs? If so, why not implement direct array assignment like so...

uint8_t data[10];
uint8_t data2[10];
/* ... */
data2 = data;   /* hypothetical: this does not compile in C */

Why after years of C programming am I having an existential language crisis about a mechanism I have used but never understood?

gettingSmarter
  • `data` is an array, NOT a pointer to the first element. In an expression it is converted to a pointer to its first element, except when it is the operand of `sizeof` or unary `&`. `&data` is the address of the array, not the address of a pointer. – MikeCAT Mar 03 '16 at 01:37
  • so, it would be better to think of test1.data[0] semantically like an access into the struct than an access into the array? – gettingSmarter Mar 03 '16 at 01:41
  • @MikeCAT: You forgot `_Alignas`. – too honest for this site Mar 03 '16 at 01:43
  • The C tag implies standard C which is C11 _only_. And this is not only read by you or me. (Also, please use `@name` to address a comment; I just happened to still be reading this thread.) Anyway, **please** keep in mind that if an array were a pointer, it would be called "pointer", not "array". Take @MikeCAT's comment very seriously! It also holds when using the index operator `[]`. There is no black magic and C is actually very straightforward here. However, why do you think an array inside a `struct` would behave differently than an `int`? – too honest for this site Mar 03 '16 at 01:50
  • @Olaf, you are correct; comment removed. I think the normal use of arrays outside of structs, and their lack of assignability, would explain the misunderstanding. – gettingSmarter Mar 03 '16 at 01:57
  • @gettingSmarter: I actually wasn't up for removing your comment, but to clarify. As a personal note, I think it is a good idea to use C11 features. But for your problem, I agree C99 is fine, too (I don't care about C90 anymore - don't use MSVC). – too honest for this site Mar 03 '16 at 02:00
  • I'm inclined to think that the reason that (direct) array assignment is not supported in C is that it is inconsistent with the aforementioned decay of array values to pointers in most contexts, notably including when an array appears as an operand of the `=` operator. When you wrap an array in a `struct`, however, the same does not apply to the whole `struct`. – John Bollinger Mar 03 '16 at 02:19

3 Answers


> Why after years of C programming am I having an existential language crisis about a mechanism I have used but never understood?

You always misunderstood arrays and now this has brought it to light :)

The actual rules are:

  1. Arrays are different to pointers; there is no "implied pointer" or anything in an array. The storage in memory for an array consists of exactly the cells with the array contents and nothing more.

  2. When you use the array's identifier in an expression, then the value of that expression is a (temporary) pointer to the array's first element. (With a handful of exceptions that I omit for brevity).

    2a. (In case this was unclear:) Expressions have values, and the value of an expression does not require storage. For example, in the code f(1 + 1), the result 2 is a value, but it is not in an object and, conceptually, it is not stored anywhere. The pointer mentioned above is the same sort of value.
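A small sketch of Rules 1 and 2, assuming a hosted C99-or-later compiler. The helper names (`array_has_no_hidden_pointer`, `decay_points_at_first_element`, `step_of_pointer_to_array`) are made up for illustration and are not from the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Rule 1: the array's storage is exactly its elements, nothing more. */
int array_has_no_hidden_pointer(void)
{
    uint8_t data[8];
    return sizeof data == 8 * sizeof data[0];
}

/* Rule 2: in most expressions `data` yields a pointer to data[0]. */
int decay_points_at_first_element(void)
{
    uint8_t data[8] = {0};
    uint8_t *p = data;              /* array-to-pointer conversion */
    return p == &data[0];
}

/* One of the exceptions: the operand of unary & is not converted, so
   &data has type uint8_t (*)[8] (pointer to the whole array), not
   uint8_t ** (pointer to a pointer). */
size_t step_of_pointer_to_array(void)
{
    uint8_t data[8] = {0};
    uint8_t (*whole)[8] = &data;
    /* whole + 1 points one past the array, so the step is the array size */
    return (size_t)((uint8_t *)(whole + 1) - (uint8_t *)whole);
}
```

If `&data` were "the address of a pointer", stepping it by one would move by the size of a pointer; here it moves by the size of the array.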

The reason you cannot write:

data2 = data;

is that Rule 2 kicks in: the value of the right-hand side is a pointer, and the assignment operation is not defined between an array and a pointer. (It wouldn't know how many units to copy.)

The language designers could have added another exception to Rule 2 so that if the array is the sole right-hand operand of = then value conversion doesn't occur, and the array is assigned by value. That would be a consistent rule and the language would work. But they didn't.

The structure assignment does not trigger Rule 2 so the array is happily copied.
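To make the contrast with the pointer case from the question concrete, here is a minimal sketch. `Test` is the type from the question; `PtrTest` and both function names are hypothetical, added only for comparison.

```c
#include <stdint.h>

typedef struct {
    uint8_t data[8];    /* the eight bytes live inside the struct */
} Test;

typedef struct {
    uint8_t *data;      /* only the pointer lives inside the struct */
} PtrTest;              /* hypothetical contrast type */

/* Returns 1 if the copy keeps its own bytes after the source changes. */
int embedded_array_copy_is_independent(void)
{
    Test test1 = { {1, 2, 3, 4, 5, 6, 7, 8} };
    Test test2;

    test2 = test1;          /* copies all eight elements */
    test1.data[0] = 99;     /* later change to the source */
    return test2.data[0] == 1 && test2.data[7] == 8;
}

/* Returns 1 if both structs end up sharing the same buffer. */
int pointer_member_copy_is_shared(void)
{
    uint8_t buffer[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    PtrTest a = { buffer };
    PtrTest b;

    b = a;                  /* copies the pointer value only */
    a.data[0] = 99;
    return b.data == a.data && b.data[0] == 99;
}
```

In both cases the assignment copies the struct's own bytes. The difference is only in what those bytes are: the array elements themselves, or a pointer to storage that lives elsewhere.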

In fact they could have done away with Rule 2 entirely, and the language would still have worked. But then you would need to write puts(&s[0]); instead of puts(s); and so on. When designing C (incorporating BCPL, which I think had a similar rule), they opted to include Rule 2, presumably because the benefits appeared to outweigh the negatives at the time.
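As a quick illustration of what Rule 2 saves you from typing, the two spellings below hand the callee the same pointer. `length_of` and `both_spellings_agree` are hypothetical helpers; `length_of` stands in for any function that takes a `const char *`, as `puts` does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: takes a pointer to the first character. */
size_t length_of(const char *text)
{
    return strlen(text);
}

int both_spellings_agree(void)
{
    char s[] = "hello";
    const char *implicit = s;       /* Rule 2: converted to &s[0] */
    const char *explicit = &s[0];   /* what you would write without Rule 2 */

    return implicit == explicit && length_of(s) == length_of(&s[0]);
}
```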

M.M

Assigning from one struct to another does an element-by-element copy of the struct's members. I think your problem is in overthinking the concept of an "element-by-element copy" operation. If you tried to do your own copy using the assignment operator on each individual element, then you would indeed run into the problem with not being able to copy the array. When you do a direct struct assignment, though, the compiler knows what code to emit to handle the internal array correctly. It's not simply syntactic sugar on top of using the assignment operator on each member.
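Here is a rough sketch of that difference, assuming a struct with one scalar and one array member. The `Packet` type, its `id` member, and the function names are invented for the example.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t id;          /* hypothetical extra member */
    uint8_t  data[8];
} Packet;                 /* hypothetical type */

/* Copying by hand, one member at a time. */
void copy_by_hand(Packet *dst, const Packet *src)
{
    dst->id = src->id;    /* scalar member: plain assignment works */
    /* dst->data = src->data;   would not compile: arrays are not assignable */
    memcpy(dst->data, src->data, sizeof dst->data);
}

/* Letting the compiler do it. */
void copy_by_assignment(Packet *dst, const Packet *src)
{
    *dst = *src;          /* copies id and all of data */
}

/* Returns 1 if both approaches produce the same member values. */
int copies_match(void)
{
    Packet src = { 7, {1, 2, 3, 4, 5, 6, 7, 8} };
    Packet a, b;

    copy_by_hand(&a, &src);
    copy_by_assignment(&b, &src);
    return a.id == b.id && memcmp(a.data, b.data, sizeof a.data) == 0;
}
```

The hand-written version has to fall back to `memcpy` for the array member, while the struct assignment needs no special handling in the source.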

bta
  • @gettingSmarter: I meant that in the sense that it's not a simple, predictable transform of the code, in the way that array notation or accessing structure members is just a different way to write pointer arithmetic. – bta Mar 03 '16 at 01:51

The name of a struct containing a fixed-length array is treated as a contiguous object and is therefore assignable, while an array name is interpreted as the address of the first element, except when it is the operand of the sizeof operator or the unary & operator.

merlin2011
  • an array name is interpreted as the address of the first element *unless it is an operand of `sizeof` or unary `&`*. This is one reason why `char hoge[]="hoge"; printf("%s\n", &hoge);` is invalid. – MikeCAT Mar 03 '16 at 01:40
  • I get the above, merlin2011. I was looking for a deeper answer such as that provided in the comment by MikeCAT. – gettingSmarter Mar 03 '16 at 01:43
  • @gettingSmarter, The address of the `struct` is interpreted literally, while there is special casing for taking the address of an array name. – merlin2011 Mar 03 '16 at 01:45