I am going through this code for learning purposes and have a question about this line:

return (char*)desc + sizeof *desc;

Why is desc cast to char*? I tried to mimic it with my own code:

#include <stdio.h>
#include <stdlib.h>

struct Test {
    int value;
};

int main() {

    struct Test* test = malloc(sizeof test);

    struct Test* test1 = (void*)test + sizeof *test;
    test1->value = 1;
    printf("%d\n", test1->value);

    struct Test* test2 = (void*)test1 + sizeof *test;
    test2->value = 10;
    printf("%d\n", test2->value);

}

This also works. But, what is the difference? Why is char* used?

Note: I used void* just to see if that works. As char* has nothing to do with the struct in question, I simply thought, "What if I use void* over there?". A more specific question could be, why not int* or float* & why char* is used?

explorer
  • Standard C doesn't allow arithmetic on `void` pointers. You're probably using a nonstandard extension, e.g. gcc supports this. – Nate Eldredge Jul 04 '20 at 04:28
  • Contrary to your beliefs, your code doesn't actually work. Remember that for any pointer or array `p` and index `i`, the expression `*(p + i)` is exactly equal to `p[i]`. That means you make `test1` point to `test[1]` and `test2` point to `test[2]`. – Some programmer dude Jul 04 '20 at 04:29
  • Also note that `sizeof test` might be different from `sizeof *test`. – Some programmer dude Jul 04 '20 at 04:30
  • Indeed, you probably wanted that first line to have `malloc(3 * sizeof *test)`. – Nate Eldredge Jul 04 '20 at 04:31
  • @Someprogrammerdude, by saying "it works", I meant that the code compiled and that, when I ran the executable, the logs printed the expected values. As NateEldredge mentioned, I am using gcc. – explorer Jul 04 '20 at 05:02
  • @NateEldredge, `3` because, `Test` initialized 3 times? If yes, I got that. I was exploring that code to understand how the array is dynamically resized. – explorer Jul 04 '20 at 05:09
  • @NateEldredge, `Standard C doesn't allow arithmetic on void pointers`, okay! But, could you explain why `char*` is used? Why not `int*` or any other pointer? – explorer Jul 04 '20 at 05:11
  • Pointer arithmetic is done in units of the size of the base type. So for `(char*)desc + sizeof *desc` the base type is `char`, so it adds `sizeof *desc * sizeof(char)` *bytes* to the pointer. If using `int` (as in `(int*)desc + sizeof *desc`) then the base type is `int`, so it adds `sizeof *desc * sizeof(int)` *bytes*, and with the common size of `int` being four bytes, it adds four times as many bytes as the `char*` version. – Some programmer dude Jul 04 '20 at 05:15
  • @NateEldredge: The C standard does not **define** arithmetic on `void` pointers. It does **allow** arithmetic on `void` pointers. The C standard allows and invites extensions to the language, and GCC and Clang implement such an extension. – Eric Postpischil Jul 04 '20 at 10:45
  • The code will also compile if you replace all numbers with twice the value. That does not tell you anything about whether it is correct or "works". – Gerhardh Jul 04 '20 at 11:01
  • @explorer, because you end up writing to three different `struct Test` objects, located one after the other. So if the buffer is smaller than `3 * sizeof(struct Test)`, you are writing off the end of it. The other point that Some programmer dude was making is that `sizeof test` is the size of a *pointer* to `struct Test`, because that's the type of `test`; it is not the size of a `struct Test` object. So that's why it should be `3 * sizeof *test`, or maybe `3 * sizeof(struct Test)`, but not `3 * sizeof test`. – Nate Eldredge Jul 04 '20 at 11:07
  • Does this answer your question? [Pointer arithmetic for void pointer in C](https://stackoverflow.com/questions/3523145/pointer-arithmetic-for-void-pointer-in-c) – imz -- Ivan Zakharyaschev Apr 28 '23 at 18:20

1 Answer

Why Pointers to Characters Are Used

Why is desc cast to char*? … As char* has nothing to do with the struct in question…

In C, every object except a bit-field is composed of a sequence of bytes.[1] Converting the address of an object to char * yields a pointer to the first byte of the object, and you can access the individual bytes of the object using that pointer.

In standard C, pointer arithmetic uses units of the pointed-to type. For a pointer p of type struct Test *, p+1 points to the next structure after p, p+2 points to the structure after that, and so on. For a pointer q of type char *, q+1 points to the next char after q, q+2 points to the char after that, and so on.

Thus, to access the individual bytes of an object, you can convert its address to a char * and use that.

Why Other Kinds of Pointers Are Not Used

A more specific question could be, why not int* or float* & why char* is used?

char * is used because all objects in C, except bit-fields, are defined to be represented as sequences of bytes. They are not necessarily sequences of int or float. unsigned char * and signed char * may also be used, and unsigned char * may be preferable due to complications from sign issues.

The C standard has special rules about accessing objects using character pointers, so it guarantees that accessing the bytes of an object this way will work. In contrast, accessing objects using int * or float * may not work. The compiler is allowed to expect that a pointer to an int will not be used to access a float object, and, when it is generating machine instructions for a program, it may write those instructions based on that expectation. Using a character pointer prevents the compiler from assuming that a char * does not point to the same place as another kind of pointer.

Why Pointers to Void Work

Note: I used void* just to see if that works.

For pointer arithmetic to work, the compiler needs to know the size of the pointed-to object. When 1 is added to a pointer to a struct Test, the compiler needs to know how many bytes to adjust the internal address by.

void is an incomplete type. There are no void objects, and void has no size. (The size is not zero. There is no size.) Because of this, the C standard does not define any meaning for p+1 when p is a void *.

However, GCC defines arithmetic on void * as an extension. It works as if void had a size of 1 byte. Clang supports this too.

Because of this extension, doing arithmetic with void * pointers is essentially the same as doing arithmetic with char * pointers.

This extension is unnecessary; any code doing arithmetic on void * could be rewritten to use char * instead. Sometimes this requires extra casts to convert pointer types, and that could be the reason the extension was added to GCC (to reduce the amount of code required and make it look better).

You can disable this extension with the switches -Werror -Wpointer-arith, or you can generally request closer conformance to standard C with -Werror -std=c18 -pedantic. I use -Werror -std=c18 -pedantic whenever possible, and I recommend it.

Footnote

[1] Bit-fields are sequences of bits that are held in some larger container of bytes and may happen to coincide with bytes.

Eric Postpischil