4

Inspired by comments to my answer here.

Is this sequence of steps legal under the C standard (C11)?

  1. Make an array of function pointers
  2. Take a pointer to the first entry and cast that pointer (a pointer to a function pointer) to void*
  3. Perform pointer arithmetic on that void*
  4. Cast it back to pointer to function pointer and dereference it.

Or equivalently as code:

void foo(void) { ... }
void bar(void) { ... }

typedef void (*voidfunc)(void);
voidfunc array[] = {foo, bar}; // Step 1

void *ptr1 = array; // Step 2

void *ptr2 = (char*)ptr1 + sizeof(voidfunc); // Step 3

voidfunc bar_ptr = *(voidfunc*)ptr2; // Step 4

I thought that this would be allowed, as the actual function pointers are only accessed through a properly typed pointer. But Andrew Henle pointed out that this doesn't seem to be covered by section 6.3.2.3 (Pointers) of the Standard.

curiousguy
jpa
  • Such confusion seems typical with typedefed pointers. Even to functions. Contrasting your code with `typedef void voidfunc(void); voidfunc* array[] = ...;` should point out exactly where any confusion may lie. – StoryTeller - Unslander Monica Oct 28 '18 at 17:01
  • @StoryTeller Hmm, nice, I didn't know you could typedef function types. It is indeed clearer than hiding the `*` inside the typedef. – jpa Oct 28 '18 at 17:08

3 Answers

9

Your code is correct.

A pointer to a function is an object and you're casting a pointer to an object (a pointer to a function pointer) to void pointer and back again; and then finally dereferencing a pointer to an object.

As for the char pointer arithmetic, this is referred to by footnote 106 of C11:

106) Another way to approach pointer arithmetic is first to convert the pointer(s) to character pointer(s): In this scheme the integer expression added to or subtracted from the converted pointer is first multiplied by the size of the object originally pointed to, and the resulting pointer is converted back to the original type. For pointer subtraction, the result of the difference between the character pointers is similarly divided by the size of the object originally pointed to. When viewed in this way, an implementation need only provide one extra byte (which may overlap another object in the program) just after the end of the object in order to satisfy the "one past the last element" requirements.
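As a rough sketch of the scheme the footnote describes, applied to the array from the question: convert to a character pointer, add the index multiplied by the element size, convert back. The counters `calls_foo` and `calls_bar` and the helper name `element_via_bytes` are made up for illustration, and the function bodies are placeholders.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*voidfunc)(void);   /* same typedef as in the question */

/* Hypothetical counters, only here to make the calls observable. */
int calls_foo, calls_bar;
void foo(void) { ++calls_foo; }
void bar(void) { ++calls_bar; }

voidfunc array[] = { foo, bar };

/* Returns element `index` of `array`, reached through char* arithmetic. */
voidfunc element_via_bytes(size_t index)
{
    assert(index < sizeof array / sizeof array[0]);  /* stay inside the array */
    char *bytes = (char *)array;          /* convert to a character pointer */
    bytes += index * sizeof(voidfunc);    /* integer scaled by the element size */
    return *(voidfunc *)bytes;            /* convert back, then read the element */
}
```

Calling `element_via_bytes(1)()` should then run `bar`, the same as `array[1]()` would.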

7

Yes, the code is fine. There are various pitfalls and conversion rules at play here:

  • C splits all types in two main categories: objects and functions. A pointer to a function is a scalar type which in turn is an object. (C17 6.2.5)
  • void* is the generic pointer type for pointers to object type. Any pointer to object type may be converted to/from void*, implicitly. (C17 6.3.2.3 §1).
  • No such generic pointer type exists for pointers to function type. Thus a function pointer cannot be converted to a void* or vice versa. (C17 6.3.2.3 §1)
  • However, any function pointer type can be converted to another function pointer type and back, allowing us to use something like for example void(*)(void) as a generic function pointer type. As long as you don't call the function through the wrong function pointer type, it is fine. (C17 6.3.2.3 §8)
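To illustrate the last bullet, here is a small sketch of the round trip through a "generic" function pointer type. The names `generic_fn`, `add_one`, `erase_type` and `call_restored` are invented for this example; the point is only that the call happens through the original type.

```c
/* Hypothetical name for the function pointer type used as the generic one. */
typedef void (*generic_fn)(void);

int add_one(int x) { return x + 1; }

/* Function pointer -> other function pointer type: allowed by 6.3.2.3 §8. */
generic_fn erase_type(int (*fn)(int))
{
    return (generic_fn)fn;
}

/* Convert back to the real type before calling; calling through
   generic_fn directly would use an incompatible type. */
int call_restored(generic_fn stored, int arg)
{
    int (*fn)(int) = (int (*)(int))stored;
    return fn(arg);
}
```

Some compilers may warn about the casts at higher warning levels, but the conversion itself is the one the bullet describes.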

Function pointers point to functions, but they are objects in themselves, just like any pointer is. And so you can use a void* to point at the address of a function pointer.

Therefore, using a void* to point at a function pointer is fine, but using it to point directly at a function is not. In case of void *ptr1 = array; the array decays into a pointer to the first element, a void (**)(void) (equivalent to voidfunc* in your example). You may point at such a pointer to function-pointer with a void*.
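A minimal sketch of that distinction, assuming a made-up helper `call_through` and a counter `hits` that exists only so the call can be observed:

```c
typedef void (*voidfunc)(void);

int hits;                       /* hypothetical counter for illustration */
void foo(void) { hits += 1; }

/* `slot` must point at a voidfunc object, not at a function. */
void call_through(void *slot)
{
    voidfunc *fp = slot;        /* void* -> pointer to object type, implicit */
    (*fp)();                    /* read the function pointer, then call it */
}
```

With `voidfunc f = foo;`, passing `&f` is the permitted case, since `&f` is a pointer to an object. Passing `(void *)foo` would be the function-to-void* conversion that 6.3.2.3 does not cover.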

Furthermore, regarding pointer arithmetic:

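  • No pointer arithmetic can be performed on a void*, because the additive operators require a pointer to a complete object type and void is an incomplete type. (C17 6.5.6 §2) Such arithmetic is a common non-standard extension that should be avoided. Instead, use a pointer to character type.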
  • A pointer to character type may, as a special case, be used to iterate over any object (C17 6.3.2.3 §7). Apart from concerns regarding alignment, doing so is well-defined and does not violate "strict pointer aliasing", should you de-reference the character pointer (C17 6.5 §7).
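For instance, a function pointer object can be copied byte by byte through a character pointer. This is only a sketch; `copy_bytes` is a made-up name standing in for a hand-written `memcpy`.

```c
#include <stddef.h>

typedef void (*voidfunc)(void);

void foo(void) {}

/* Copies `n` bytes from `src` to `dst`, one character at a time. */
void copy_bytes(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;            /* character type: may access any object */
    const unsigned char *s = src;
    for (size_t i = 0; i < n; ++i)
        d[i] = s[i];
}
```

After `voidfunc a = foo, b = 0; copy_bytes(&b, &a, sizeof a);` the copy `b` should compare equal to `foo`.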

Therefore, (char*)ptr1 + sizeof(voidfunc); is also fine. You then convert the resulting void* to voidfunc* and dereference it, which yields a voidfunc, the original function pointer type stored in the array.

As has been noted in the comments, you can improve the readability of this code significantly by using a typedef to a function type:

typedef void (voidfunc)(void);

voidfunc* array[] = {&foo, &bar}; // Step 1
void* ptr1 = array; // Step 2
void* ptr2 = (char*)ptr1 + sizeof(voidfunc*); // Step 3
voidfunc* bar_ptr = *(voidfunc**)ptr2; // Step 4
Lundin
0

Pointer arithmetic on void* is not in the C language. You're not doing it though; you are doing pointer arithmetic on char*, which is perfectly OK. You could have used char* instead of void* to begin with.

Andrew Henle seems to be missing the fact that a pointer to a function is an object, and its type is an object type. It is a plain simple fact, not something veiled in a shroud of mystery as some other commentators seem to imply. So his objection to casting a pointer to a function pointer is unfounded, as pointers to any object type can be cast to void*.

However, the C standard doesn't seem to allow using (T*)((char*)p + sizeof(T)) in lieu of (p+1) (where p is a pointer to an element of an array of type T), or at least I cannot find such permission in the text. Your code might not be legal because of that.
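To make the two expressions concrete, here is a sketch with T being the question's `voidfunc`. The helper name `same_address_as_next` is made up. It only compares the two pointers and deliberately does not dereference the byte-stepped one, so it shows that the addresses agree, not that the access is permitted.

```c
typedef void (*voidfunc)(void);

void foo(void) {}
void bar(void) {}

voidfunc array[] = { foo, bar };

/* Returns 1 if (T*)((char*)p + sizeof(T)) compares equal to p + 1.
   `p` must point at an array element that has a successor or
   one-past-the-end position. */
int same_address_as_next(voidfunc *p)
{
    voidfunc *via_bytes = (voidfunc *)((char *)p + sizeof(voidfunc));
    return via_bytes == p + 1;
}
```

`same_address_as_next(array)` is expected to return 1. Whether equal pointers are also interchangeable for access is exactly the open point raised above.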

n. m. could be an AI
  • Could you expand on the reservation you express in your third paragraph? One does not need express "permission" to rely on C's semantics, and I think C semantics do define the desired behavior for the OP's (roundabout) approach, as long as the bounds of `array` are not exceeded. – John Bollinger Oct 28 '18 at 17:39
  • well, `(T*)((char*)p + sizeof(T))` is the only way that `qsort` could be implemented... – Antti Haapala -- Слава Україні Oct 28 '18 at 18:05
  • @JohnBollinger I cannot find where the standard defines behaviour of `(T*)((char*)p + sizeof(T))`. Note I do know that the expression `((char*)p + sizeof(T))` represents the address where the second element of the array resides. I don't see where the standard allows to access that element using a pointer obtained this way. If you can find any, please share references to relevant paragraphs. – n. m. could be an AI Oct 28 '18 at 18:11
  • @AnttiHaapala There is a whole lot of common practices that the standard apparently disallows. It could very well disallow you to implement your own `qsort`. Why not? What part of the standard allows it? – n. m. could be an AI Oct 28 '18 at 18:33
  • @n.m. well, first of all, you're allowed to iterate over bytes of an object. The question really is if you're allowed to iterate over *each byte of an array* with a pointer to an object cast to pointer to char. If not, then you cannot use your own `memcpy` instead of the standard library `memcpy` to copy arrays. – Antti Haapala -- Слава Україні Oct 28 '18 at 18:37
  • @AnttiHaapala *you're allowed to iterate over bytes*. The object you have a pointer to is the first element of the array, not the entire array. (I know the address is the same). So you can iterate over the bytes of `*p`. *then you cannot use your own `memcpy`* Again, I don't see why your own `memcpy` is allowed, just like your own `qsort`. There's no law that all or most of the standard library should be implementable in standard conforming C. Further, permission to iterate over bytes of an object doesn't mean you also have permission to access these bytes via an lvalue of any other type. – n. m. could be an AI Oct 28 '18 at 18:45
  • @n.m. indeed, and there was no access using an lvalue of any other type but the type of the object itself *and* the character type. – Antti Haapala -- Слава Україні Oct 28 '18 at 19:01
  • @AnttiHaapala *but the type of the object itself* yes but that's a pointer to a different object, not the one you have started with. I know there is a valid object at that address. I don't know what allows you to dereference that pointer. – n. m. could be an AI Oct 28 '18 at 19:03
  • @n.m., I don't see how the method by which the pointer is obtained makes a difference, unless you're positing that the OP's expression may produce a different value of that type than `array[1]` would do. The semantics of a value interpreted according to a given type do not depend on how the value was computed, as long as the value's computation itself has defined behavior. – John Bollinger Oct 28 '18 at 19:05
  • @n.m. http://port70.net/~nsz/c/c11/n1570.html#note106 even if non-normative it signifies intent. – Antti Haapala -- Слава Україні Oct 28 '18 at 19:14
  • @JohnBollinger *I don't see how the method by which the pointer is obtained makes a difference* This is crucial. Does the standard actually contain a language to this effect, or this is just one of those commonly believed things? Are you allowed to do things like `int a[2][2]; (&a[0][0])[2]=42;`? – n. m. could be an AI Oct 28 '18 at 19:17
  • @n.m. no you are not, that is because an array is accessed out of bounds. But there is nothing in the standard that says that an object that is an element in an array is also an array of length of 1 by itself. – Antti Haapala -- Слава Україні Oct 28 '18 at 19:19
  • @AnttiHaapala I agree that the intent is there, even without the footnote ;) – n. m. could be an AI Oct 28 '18 at 19:20
  • @AnttiHaapala If it doesn't matter how a pointer is obtained, then `(&a[0][0])[2]` has the same address as `a[1][0]` and everything is OK. OTOH if this is not allowed, then it does matter how a pointer is obtained. – n. m. could be an AI Oct 28 '18 at 19:23
  • @n.m. but *that* case is explicitly undefined behaviour that has been confirmed in committee responses to C89 defect reports almost 30 years ago. – Antti Haapala -- Слава Україні Oct 28 '18 at 19:26
  • @AnttiHaapala So it doesn't matter how a pointer is obtained, *except in that one special case*? I somehow doubt this claim. – n. m. could be an AI Oct 28 '18 at 19:28
  • Of course it matters how the pointer is obtained. You can never compatibly obtain a pointer by `char *` arithmetic where you would go out of bounds by effective-type arithmetic. But here everything happens in the confines of the `voidfunc array[2]`, that you can get from `&array[0]` to `&array[1]` by just that or casting to `void *` to `char *` and adding `sizeof (voidfunc)`, and then to `voidfunc *` - that's nothing that you couldn't do without the casts already. – Antti Haapala -- Слава Україні Oct 28 '18 at 19:32
  • @AnttiHaapala yes, if *actually* (as opposed to conceptually) doing pointer arithmetic as described in the footnote is legal, then the code in the question is legal too, because it is doing exactly what is described in the footnote. – n. m. could be an AI Oct 28 '18 at 19:38
  • @n.m., the standard's specifications for the results and side effects of operations are contingent only on the values of the operands. We know this not because the standard says it in so many words, but simply because it does not specify any other contingencies. – John Bollinger Oct 28 '18 at 19:55
  • @JohnBollinger I'm asking whether some value computation has defined behaviour. There are two dubious operations. First is the conversion of the incremented `char*` value to `voidfunc*`. Why is it defined? (Quotes please; I have already seen the footnote, is there normative language?) Assuming this *is* allowed, the next question is about dereferencing the resulting pointer. Why is this defined? When we establish validity of these two operations, we can talk about the value they produce (but there's no need because it's clear what it *would* be). – n. m. could be an AI Oct 28 '18 at 20:25
  • @n.m., the cast is allowed by [paragraph 6.3.2.3/7](https://port70.net/~nsz/c/c11/n1570.html#6.3.2.3p7). You've already stipulated that the `char *` points at `array[1]`, so we know that it is acceptably aligned for a `void *`. – John Bollinger Oct 28 '18 at 20:32
  • @n.m., the behavior of dereferencing is specified by [paragraph 6.5.3.2/4](https://port70.net/~nsz/c/c11/n1570.html#6.5.3.2p4). The only way out of this is the out I offered you at the beginning: the proposition that the `void **` value obtained via the OP's route is different from the value of `array + 1`. – John Bollinger Oct 28 '18 at 20:35
  • @JohnBollinger yes the cast is allowed, I have found that paragraph. The second question remains. *the char * points at array[1]* no it does not, it merely represents the same address as that of `array[1]`. For example, in `int a[2][2];` the pointer `a[0]+2` compares equal to `a[1]+0` as they represent the same address, however one can dereference the latter but not the former. So in your sense they are different values. – n. m. could be an AI Oct 28 '18 at 20:41
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/182686/discussion-between-n-m-and-john-bollinger). – n. m. could be an AI Oct 28 '18 at 20:43
  • Sorry, @n.m., my previous response was made on the way out the door. Yes, it is more correct to say that the `char *` represents the same address as that of `array[1]`. So my question, then, is whether you are suggesting that the result of converting that pointer to type `void *` and from there to `voidfunc*` may point to a *different* address than that of `array[1]`? If so, then I agree that the standard does not appear to forbid such a result, though I don't think it's within the standard's intent. – John Bollinger Oct 28 '18 at 22:41
  • But I reject any interpretation allowing that if at some point in a program I have two pointer expressions of the same type and with defined behavior, such that `(e1) == (e2)` is true, and `e1` evaluates to a valid pointer to an object or function, then it might nevertheless not be safe to use `(e2)` interchangeably with `(e1)` while their equality holds. Such an interpretation is not plausible. – John Bollinger Oct 28 '18 at 23:09
  • @JohnBollinger The address is the same. The pointers compare equal. But they are not necessarily interchangeable. I believe DR17 has established that many years ago. See the link above. – n. m. could be an AI Oct 29 '18 at 05:15