
I’m confused about effective type in the case of pointers to arrays in C. Does accessing an individual member via a pointer to an array impart an effective type only on the memory for that member or across all the memory encompassed by the array? Is the Standard clear in this regard?

int ( *foo )[ 10 ] = malloc( sizeof( *foo ) );
**foo = 123; //Or ( *foo )[ 0 ] = 123

//Is the effective type of the memory for (*foo)[ 0 – 9 ] now also int?
//Does the whole region now have an effective type?
//Or can this memory be used for other purposes?

Here's a practical example:

#include <stdalign.h> //alignof
#include <stdlib.h>   //malloc, size_t

int (*foo)[ 10 ];
double *bar;

//Figure out the size of an int padded to comply with double alignment
size_t padded_int_size =
( ( ( sizeof( int ) + alignof( double ) - 1 ) / alignof( double ) ) * alignof( double ) );

//Allocate memory for one padded int and 1000 doubles,
//which will in any case be larger than an array of 10 ints
foo = malloc( padded_int_size + sizeof( double ) * 1000 );

//Set our double pointer to point just after the first int
bar = (double*)( (char*)foo + padded_int_size );

//Do things with ( *foo )[ 0 ] or **foo
//Do things with bar[ 0 - 999 ]

Does the above code invoke undefined behavior?

I searched online and found that most discussions about aggregate types and effective type concerned struct pointers, not pointers to arrays. Even then, there seems to be disagreement and confusion over whether setting a single struct member imparts an effective type only for that member or for the entire block of memory that the struct would encompass.

Jackson Allan
  • Re “Is the Standard clear in this regard?”: No. – Eric Postpischil Nov 04 '21 at 18:49
  • Imagine the computer which stores all doubles in a special memory pool only accessible by the FPU. FPU does not have access to the normal memory. Transfers from/to "normal memory" require special machine code instructions. Will your pointer punning work? – 0___________ Nov 04 '21 at 19:06
  • @0___________ Could such a computer offer malloc in the first place, given that the function must return _contiguous_ memory suitable for use with _any_ type? If so, and the computer can therefore decide to use special memory for certain segments of a malloced block some time after its allocation, then I think your question is essentially the same one I'm asking: Does accessing one member through a pointer to an aggregate type permit the compiler to make type assumptions about all the other members and/or the entire block? – Jackson Allan Nov 04 '21 at 19:19

1 Answer

//Is the effective type of the memory for (*foo)[ 0 – 9 ] now also int?

If you are asking whether the effective type for the whole region is (one) int, then I think it's very hard to make an argument for that. I would be inclined to say no, but as Eric remarked in comments, the language specification is not clear here.

If you are asking whether the effective type of *foo is now int[10], then there is an easier argument for that. I would be inclined to say "yes". In the event that that position is accepted, there is an even stronger argument for the nine int-sized sub regions at the tail of *foo each having effective type int, but that wouldn't be an unassailable position.

//Does the whole region now have an effective type?

See above.

//Or can this memory be used for other purposes?

This is not an exclusive alternative. Any or all of the allocated object can be used for other purposes, regardless of the answers to the previous questions. The effective type of an object without a declared type can be reassigned by writing to that object.
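As a minimal sketch of that last point, the following reuses one allocated block first as an int and then as a double. The function name `reuse_block` is made up for illustration and is not from the question; the reading of 6.5/6 in the comments follows the interpretation given in this answer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the question): stores an int into an
   allocated block, then reuses the same bytes for a double. */
static double reuse_block(void)
{
    /* The block must be large enough for either type. */
    assert(sizeof(double) >= sizeof(int));

    /* Allocated storage has no declared type. */
    void *block = malloc(sizeof(double));
    if (block == NULL) {
        return -1.0; /* allocation failed */
    }

    int *as_int = block;
    *as_int = 123;        /* the store gives these bytes effective type int */
    int seen = *as_int;   /* reading back as int is fine */

    double *as_double = block;
    *as_double = 4.5;     /* a new store: effective type is now double */
    double result = *as_double + seen; /* reading back as double is fine */

    /* Reading *as_int at this point would be undefined behavior,
       because the object currently holds a double. */
    free(block);
    return result;
}
```

Calling `reuse_block()` from `main` should return 127.5. Each read goes through the same type as the most recent store to those bytes, which is what keeps the accesses defined.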

//Do things with ( *foo )[ 0 ] or **foo
//Do things with bar[ 0 - 999 ]

Does the above code invoke undefined behavior?

The computation and assignment of values for foo and bar are fine. The rest depends at least in part on what things are done. If you write all or part of the space via foo and then read that same space back via bar, then the behavior is undefined. For other combinations of actions, it is less clear whether the behavior is defined.
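For illustration, here is a sketch of the question's layout in which the int and the doubles occupy disjoint bytes, and no byte is ever read through a type other than the one last stored there. The names `padded_int_size`, `disjoint_demo`, and `DOUBLE_COUNT` are assumptions made for this example. Whether the access through the `int (*)[10]` pointer says anything about the other nine elements is exactly the point on which the language specification is unclear, so treat this as a demonstration of the pattern rather than a guarantee of conformance.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>

enum { DOUBLE_COUNT = 1000 }; /* matches the question's 1000 doubles */

/* Round sizeof(int) up to the next multiple of alignof(double). */
static size_t padded_int_size(void)
{
    size_t align = alignof(double);
    return (sizeof(int) + align - 1) / align * align;
}

/* Hypothetical demo: returns 1 if every value reads back as written,
   0 if not, and -1 if the allocation fails. */
static int disjoint_demo(void)
{
    size_t offset = padded_int_size();
    assert(offset % alignof(double) == 0);

    int (*foo)[10] = malloc(offset + sizeof(double) * DOUBLE_COUNT);
    if (foo == NULL) {
        return -1;
    }

    /* The doubles start just past the padded first int. */
    double *bar = (double *)((char *)foo + offset);

    (*foo)[0] = 123; /* only element 0 is ever accessed as int */
    for (size_t i = 0; i < DOUBLE_COUNT; i++) {
        bar[i] = (double)i; /* these bytes are only ever accessed as double */
    }

    int ok = (*foo)[0] == 123
          && bar[0] == 0.0
          && bar[DOUBLE_COUNT - 1] == (double)(DOUBLE_COUNT - 1);

    free(foo);
    return ok;
}
```

The case the answer identifies as undefined, writing bytes via foo and reading those same bytes back via bar, never occurs here, because `bar` begins at or beyond the end of `(*foo)[0]`.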

John Bollinger
  • "If you write all or part of the space via foo and then read that same space back via bar then the behavior is undefined." <- To be clear, I'm not talking about ever using the same bytes that store an int to store a double or vice versa. The real-world problem is this: I _must_ use a pointer to an array of type X and length N. However, I _know_ that the array it points to only needs to store one element, and only at index 0. I'm wondering whether I can store other data in the memory of the rest of the array (and beyond if I allocate enough), or whether I must treat it as dead space. – Jackson Allan Nov 05 '21 at 00:55
  • @JacksonAllan, are you writing for a highly space-constrained environment? Or do you have at any time enough of these arrays of which you need only one element that they are large in the aggregate? Because 9 `int`s worth of space is a small price to pay for confidence in the correctness and portability of your program, and even 9000 `int`s is still fairly insignificant in many modern environments. – John Bollinger Nov 05 '21 at 02:29
  • The basic idea is to tag a pointer with extra, compile-time metadata. The purpose is a usability-focused generic container library whose principles include that the user never has to pre-declare anything using macros (i.e. no `DEFINE_LIST(int,float,int_float_map)` at the top of a file) and never has to typecast or specify types except when declaring a container. When the container only needs to be associated with one type, these principles can be met by hiding it behind a pointer to that type (the user’s handle). But containers like hash tables need to be associated with two types... – Jackson Allan Nov 09 '21 at 01:04
  • ...For such containers, a nameless struct could work but can’t be passed in/out of functions. My idea is that the user’s handle will instead be a pointer-to-array, with the array size specifying the integer code or size of the second type. So the handle for a hash table would be not `val_type *name` but `val_type *(*name)[ key_type_code ]`, with the first pointer in the array pointing to the beginning of the actual data. But that means a memory overhead of ( key_type_code - 1 ) * sizeof( val_type* ). If effective-type rules permitted, this overhead could be mostly eliminated... – Jackson Allan Nov 09 '21 at 01:06
  • ...In this regard, the responder to [this post](https://stackoverflow.com/questions/57391249/convert-int-to-intn-in-c-c) seems certain that accessing elements via an array pointer always boils down to an access to an individual element because an array cannot itself be used as an lvalue. But then again, who really knows? – Jackson Allan Nov 09 '21 at 01:12
  • @JacksonAllan, to the extent that the responder you cite seems certain of their interpretation, I think that certainty is misplaced. And I also think that that in itself is an excellent reason to stay well clear. Hanging your hat on a fine point of interpretation of the spec is an invitation for trouble, even if you're right. – John Bollinger Nov 09 '21 at 04:48
  • Where exactly is it stated in the standard that the effective type of a malloced region can be changed by writing into it? – tstanisl Jan 20 '22 at 20:30
  • @tstanisl, that would be C17 paragraph 6.5/6 (and similar in earlier versions of the language spec): "If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value". There is a similar provision for writes via `memcpy` or `memmove`. Objects having no declared type include all allocated objects and their subobjects, and I don't think standard C specifies any other such objects. – John Bollinger Jan 20 '22 at 20:51
  • Ok. I got it. I've always thought it was referring to objects with no type (like a brand new result of malloc). However it refers to all writing operations to dynamic objects. Thanks for clarification. – tstanisl Jan 21 '22 at 07:41
  • Yes, @tstanisl. Effective types are not declared types (else the specs wouldn't need to distinguish these categories), so writing to an object without a declared type does not change that object into one with a declared type. Nor do the specs take the available opportunity to exclude objects with (only) effective types from type (re)assignment. 6.5/6 applies to allocated objects no matter how many times they have been written, no matter via what lvalues. – John Bollinger Jan 21 '22 at 14:20