8

The malloc() function returns a pointer of type void *. It allocates the number of bytes given by its size_t argument. The resulting allocation is raw storage that can be used with any data type in C (without casting).

Can a char array declared within a function that returns void * be used with any data type, like the allocation returned by malloc?

For example,

#include <stdio.h>
#include <string.h>

void *Stat_Mem();

int main(void)
{
    //size : 10 * sizeof(int)
    int buf[] = { 1,2,3,4,5,6,7,8,9,10 };

    int *p = Stat_Mem();

    memcpy(p, buf, sizeof(buf));

    for (int n = 0; n < 10; n++) {
        printf("%d ", p[n]);
    }
    putchar('\n');

    return 0;
}

void *Stat_Mem()
{
    static char Array[128];
    return Array;
}
machine_1
  • 2
    No. An array like this `char array[n];` *has* a declared type: Array of char. Accessing it through an lvalue of a type other than `char` (or another character type) is undefined. `[m/c/re]alloc()` are special in that they allocate memory *without* a declared type. – EOF Jul 21 '16 at 17:23
  • 5
    `char*` can alias any type. But not vice versa. Basically you are trying to access `char*` as `int*` here, which is a violation of this rule. – Eugene Sh. Jul 21 '16 at 17:24
  • 2
    @machine_1 It is not guaranteed that the static character array is aligned appropriately for the type it is cast to. – Vlad from Moscow Jul 21 '16 at 17:24
  • 4
    You can return any pointer as `void *`. But you must only dereference it with its effective type! – too honest for this site Jul 21 '16 at 17:25
  • @EOF then how can the malloc'd array be used with all the data types? – machine_1 Jul 21 '16 at 17:26
  • Do we have the *strict aliasing* topic in the [Documentation](http://stackoverflow.com/documentation) yet? – Eugene Sh. Jul 21 '16 at 17:27
  • @EOF are you saying reinterpret_cast<> is undefined? – David Thomas Jul 21 '16 at 17:29
  • 1
    @machine_1: C11 draft standard n1570: *6.5 Expressions 6 The effective type of an object for an access to its stored value is the declared type of the object, if any. 87) If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.* – EOF Jul 21 '16 at 17:29
  • @DavidThomas: `reinterpret_cast<>` is not part of C. – EOF Jul 21 '16 at 17:30
  • @EOF I stand corrected. – David Thomas Jul 21 '16 at 17:31
  • @EOF Just for clarification of the quote. Once you have filled the allocated memory with some struct data, its effective type becomes this struct type. Right? – Eugene Sh. Jul 21 '16 at 17:39
  • 1
    @EugeneSh.: Until you write to the memory through a different type, yes. – EOF Jul 21 '16 at 17:40
  • It is legal to explicitly cast a pointer to a different type. EOF asserted that this kind of reinterpretation was undefined. The statement needs to include alignment to be accurate. If the programmer assures proper alignment it is not undefined. – David Thomas Jul 21 '16 at 17:40
  • @EOF That's an interesting point. Not sure I understand it fully. Can I allocate some memory, cast the pointer to `T1*`, fill it with `T1` data and then *cast* it back to `void*`, then to `T2*` and write `T2` there? – Eugene Sh. Jul 21 '16 at 17:43
  • @DavidThomas I assert that *accessing* an array of `char` through an lvalue of type `int`, as the OP's question does, is undefined. C11 draft standard n1570: *6.5 Expressions 7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types: 88) — a type compatible with the effective type of the object,[...] — a character type.* – EOF Jul 21 '16 at 17:44
  • 1
    @EugeneSh.: Yes, you can. – EOF Jul 21 '16 at 17:45
  • @EOF I contribute " If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access " per 6.5 paragraph 6. `char a[16]; int foo = *(int *)(void *)a;` Furthermore, `*(int*)(void*)a = 10;` E.g. This is precisely how big/little endian conversions are performed. – David Thomas Jul 21 '16 at 18:04
  • @DavidThomas Did you read my very first comment on this question? – EOF Jul 21 '16 at 18:06
  • The OP has erased the type by the return value of Stat_mem. This makes it an object of no declared type. – David Thomas Jul 21 '16 at 18:08
  • But it *is* declared. Right there, in the code. – Eugene Sh. Jul 21 '16 at 18:09
  • @DavidThomas: What? No. Casting a pointer does not change the declared type of the object it points to. – EOF Jul 21 '16 at 18:10
  • @EOF Are you on C docs? – 2501 Jul 21 '16 at 18:10
  • @2501 Pardon me? I'm not sure I understand what you mean. – EOF Jul 21 '16 at 18:11
  • @EOF Are you on documentation beta? – 2501 Jul 21 '16 at 18:11
  • @2501: I've looked into it and requested a topic, but haven't really contributed yet. Why? – EOF Jul 21 '16 at 18:12
  • @EOF I have posted some docs for strict aliasing in C. – 2501 Jul 21 '16 at 18:15

3 Answers

6

The declared type of the static object Array is char. The effective type of this object is its declared type. The effective type of an object that has a declared type cannot be changed, thus for the remainder of the program the effective type of Array is char.

If you try to access the value of an object through a type that is neither compatible with its effective type nor on this list (footnote 1), the behavior is undefined.

Your code tries to access the stored value of Array using the type int. This type is not compatible with the type char and is not on the list of exceptions, so the behavior is undefined when you read the array using the int pointer p:

printf("%d ", p[n]);
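For contrast, here is a minimal sketch of the same idea with the buffer taken from malloc instead of a static char array. Dyn_Mem and sum_via_allocated_storage are hypothetical names introduced for illustration; they are not part of the question's code. Allocated storage has no declared type, so copying int values into it gives it the effective type int, and reading it through an int pointer is then defined:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical counterpart of Stat_Mem(): storage obtained from malloc()
   has no declared type. */
void *Dyn_Mem(size_t size)
{
    return malloc(size);    /* may return NULL; the caller must check */
}

/* Copies ten ints into allocated storage and sums them through an int
   pointer. Returns -1 if the allocation fails. */
int sum_via_allocated_storage(void)
{
    int buf[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
    int *p = Dyn_Mem(sizeof buf);
    if (p == NULL)
        return -1;

    /* Copying from an int array into storage with no declared type gives
       the destination the effective type of the source (C11 6.5p6). */
    memcpy(p, buf, sizeof buf);

    int sum = 0;
    for (size_t n = 0; n < sizeof buf / sizeof buf[0]; n++)
        sum += p[n];        /* defined: the effective type is int */

    free(p);
    return sum;
}
```

When the allocation succeeds, sum_via_allocated_storage() returns 55.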

1 (Quoted from: ISO/IEC 9899:201x 6.5 Expressions 7)
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
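If the static char array has to stay, one approach that remains within the list above is to move the bytes with memcpy instead of dereferencing an int pointer. This is a sketch under stated assumptions: store_int and load_int are hypothetical helpers, and the caller is assumed to keep idx below sizeof Array / sizeof(int). The array is only ever accessed as characters, and the int value is read from a genuine int object:

```c
#include <string.h>

static char Array[128];     /* declared type char, as in the question */

/* Hypothetical helper: copies the bytes of value into slot idx of Array.
   No bounds checking; idx must be < sizeof Array / sizeof(int). */
void store_int(size_t idx, int value)
{
    memcpy(Array + idx * sizeof value, &value, sizeof value);
}

/* Hypothetical helper: copies the bytes of slot idx into a real int
   object and returns it. Array itself is never read through an int
   lvalue. */
int load_int(size_t idx)
{
    int value;
    memcpy(&value, Array + idx * sizeof value, sizeof value);
    return value;
}
```

For example, after store_int(3, -7), a call to load_int(3) returns -7. Alignment of Array does not matter here because memcpy works byte by byte.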

2501
  • Is there *anything* legal that can be done by dereferencing `p` after `memcpy(p, buf, sizeof(buf));` ? – Eugene Sh. Jul 21 '16 at 19:24
  • @EugeneSh. Only if you're using allocated storage duration. – 2501 Jul 21 '16 at 19:25
  • Not sure I understand this sentence. – Eugene Sh. Jul 21 '16 at 19:26
  • 1
    @EugeneSh. Look up *storage duration* in the Standard. – 2501 Jul 21 '16 at 19:27
  • Out of curiosity, can you think of any way in which code might take advantage of c11's new _Alignas qualifier without relying upon behaviors where the Standard imposes no requirements? – supercat Jul 22 '16 at 21:33
  • @Seb: Compilers don't need permission to align things in whatever fashion they would consider optimal; some ABIs might imply that structures should be laid out in less-than-optimal fashion, but the Standard doesn't consider such issues. By far the most common scenario where aligning something more coarsely than would be required for its own type would be useful occurs with implementations that allow arrays of static or automatic duration to be used for type-agnostic data storage--something any platform should be able to support with zero loss of efficiency. – supercat Jul 26 '16 at 20:15
  • @supercat Indeed they don't. You'd be able to cite that with a quotation if only you knew where in the standard to look ;) You're aware that the most common *platform*s used all carry a cost when data is misaligned, right? It seems like you're confused... Have you ever designed a 32-bit bus that aligns to non-32-bit-aligned regions without cost before? You would be paying much more for such a mainboard... – autistic Jul 29 '16 at 07:50
  • @Seb: No object can *require* alignment coarser than its size, but objects may nonetheless *benefit* from such alignment. For example, an implementation might be faster to load 16-bit values which are known to be 32-bit aligned than those which are not. In the absence of any ABI constraints, I'm not sure what generally-useful optimizations an implementation would be allowed to perform given _Alignas which it would not be able to perform in the absence of such a directive. If _Alignas was applicable to pointer-target types, I could see some very useful optimization opportunities there, ... – supercat Jul 29 '16 at 16:44
  • ...but that would require that the Standard include rules about pointer compatibility which it presently lacks (e.g. would a `void _Alignas(int) *foo` be required to have the same representation as a `void *foo` even in platforms where an `int*` would be smaller than a `char*`?). Is the qualifier valid for pointers? If not, what optimizations would it facilitate? – supercat Jul 29 '16 at 16:48
  • @supercat How would you expect a compiler to know how to optimise for a platform when it hasn't been told how? Those are the situations this would be useful for. Indeed, you could use a union instead. On a vaguely related note, you could use casting and pointer arithmetic to determine the size of an object, too, but most people are fine and dandy with `sizeof`... – autistic Jul 29 '16 at 17:45
  • @Seb: I suppose it's possible that cache-line alignment might be more important for some objects than others, and _Alignof might help with that, but being able to declare a pool of buffers that can be used to hold things of any type whose alignment is no coarser than a particular type seems like it would be far more useful on implementations that don't interpret aliasing rules as preventing that. – supercat Jul 31 '16 at 21:15
  • @Seb supercat Please use chat: https://chat.stackoverflow.com/rooms/54304/c – 2501 Jul 31 '16 at 21:17
3

No, you cannot use an arbitrary byte array for an arbitrary type because of possible alignment problems. The standard says in 6.3.2.3 Conversions/Pointers (emphasis mine):

A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.

As char has the smallest alignment requirement, you cannot be sure that your char array will be correctly aligned for any other type. That is why malloc guarantees that the buffer it returns (even though it is a void *) is suitably aligned for any object type with a fundamental alignment requirement.
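The alignment concern on its own can be addressed in C11. The following is a minimal sketch, assuming a C11 compiler and the optional uintptr_t type; pool, is_aligned and pool_is_max_aligned are hypothetical names used only for illustration. It requests the strictest fundamental alignment for a byte buffer and checks it:

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* C11: request the strictest fundamental alignment for the buffer. */
static alignas(max_align_t) unsigned char pool[128];

/* Returns 1 if p is a multiple of alignment, 0 otherwise. The
   pointer-to-integer conversion is implementation-defined; on common
   flat-memory platforms the result is the address. */
int is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}

/* Returns 1 if pool satisfies the alignment of max_align_t. */
int pool_is_max_aligned(void)
{
    return is_aligned(pool, alignof(max_align_t));
}
```

This only covers the 6.3.2.3 requirement quoted above. Whether the buffer may then be accessed through a non-character type is a separate question governed by the effective-type rule in 6.5.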


I think that

union {
    char buf[128];
    long long i;
    void * p;
    long double f;
};

should have correct alignment for any type, as it is compatible with the largest basic types (as defined in 6.2.5 Types). I am pretty sure that it will work on all common implementations (gcc, clang, msvc, ...), but unfortunately I could not find any confirmation that the standard allows it, essentially because of the strict aliasing rule as defined in 6.5 Expressions §7:

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

So IMHO there is no portable, standard-conforming way to build a custom allocator that does not use malloc.

Serge Ballesta
  • The alignment is not relevant here. This is because this would still be undefined if the alignment were the same. The problem is effective type of the static char array, which cannot be changed to int. See here: http://stackoverflow.com/a/38510909/4082723 – 2501 Jul 21 '16 at 18:31
  • 1
    @2501: That link goes to a deleted answer, not everyone can see it. – Dietrich Epp Jul 21 '16 at 18:46
  • @2501: Alignment is something visible on ARM architectures: you get errors if you do not respect it, even non language lawyers can understand it. I know that converting a pointer to one type to a pointer to another type gives a pointer that you can only convert back to the original type. But it is harder to make it *real*. – Serge Ballesta Jul 21 '16 at 21:24
  • 1
    The part about the `union` in the end is questionable. The only seemingly related thing I find in the standard is C11 draft standard n1570 *6.2.5 Types 28 A pointer to void shall have the same representation and alignment requirements as a pointer to a character type. [...] All pointers to structure [and respectively] union types shall have the same representation and alignment requirements as each other. Pointers to other types need not have the same representation or alignment requirements.* , but this is talking about the alignment *of the pointer itself*, not of the type *pointed to*. – EOF Jul 22 '16 at 00:37
  • I agree, alignment is important, but since C allows for alignment to be the same for all types, it ultimately comes down to strict aliasing. – 2501 Jul 22 '16 at 07:21
  • @EOF: After reading your comment, I agree with you: I had misunderstood paragraph 6.2.5 – Serge Ballesta Jul 22 '16 at 08:57
-1

If one reads the rationale of the C89 Standard, the only reason that the type-aliasing rules exist is to avoid requiring compilers to make "worst-case aliasing assumptions". The given example was:

    int a;
    void f( double * b )
    {
        a = 1;
        *b = 2.0;
        g(a);
    }
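To make the optimization concrete, here is a self-contained variation of that example. It is a sketch, not the Rationale's own code: the call to g is replaced by a return value so that the effect is observable. Since a double lvalue may not legally access an int object, a compiler is allowed to assume the store through b leaves a unchanged:

```c
/* Variation of the Rationale example; f returns a instead of calling g. */
int a;

int f(double *b)
{
    a = 1;
    *b = 2.0;   /* the compiler may assume this store does not modify a */
    return a;   /* so it may return 1 without reloading a from memory   */
}
```

Called with the address of an ordinary double object, f stores 2.0 there and returns 1. Passing a pointer that actually refers to a would be undefined behavior, which is exactly the case the compiler is permitted to ignore.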

If a program creates a "char" array within a union containing something whose alignment would be suitable for any type, takes the address thereof, and never accesses the storage of that union except through the resulting pointer, there should be no reason why aliasing rules should cause any difficulty.

It's worthwhile to note that the authors of the Standard recognized that an implementation could be simultaneously compliant but useless; see the rationale for C89 2.2.4.1:

While a deficient implementation could probably contrive a program that meets this requirement, yet still succeed in being useless, the Committee felt that such ingenuity would probably require more work than making something useful. The sense of the Committee is that implementors should not construe the translation limits as the values of hard-wired parameters, but rather as a set of criteria by which an implementation will be judged.

While that particular statement is made with regard to implementation limits, the only way to interpret C89 as being even remotely compatible with the C dialects that preceded it is to regard it as applying more broadly as well: the Standard doesn't try to exhaustively specify everything that a program should be able to do, but relies upon compiler writers' exercising some common sense.

Use of a character-type array as a backing store of any type, assuming one ensures alignment issues are taken care of, shouldn't cause any problems with a non-obtusely written compiler. The Standard didn't mandate that compiler writers allow such things because they saw no reason to expect them to do otherwise. Unfortunately, they failed to foresee the path the language would take in the 21st Century.

supercat
  • 1
    That citation is entirely out of context. The problem with the implementation limits is not even remotely related to accessing objects through lvalues of types other than the effective type of the object. – EOF Jul 23 '16 at 00:57
  • @EOF: It proves that the authors of the Standard did intend to fully describe the behavior of a *quality* C implementation, and recognized that a C implementation could simultaneously be fully compliant and yet of such poor quality to be useless. The fact that a compliant C compiler is allowed to ignore cases in which aliasing is obvious does not mean that a C compiler which does so should not be considered grossly inferior to one that recognizes such aliasing. – supercat Jul 25 '16 at 14:12
  • s/did intend/didn't intend/ – supercat Jul 26 '16 at 20:00