I came across a simple (maybe oversimplified) implementation of the sizeof operator in C, which goes as follows:

#include <stdio.h> 

#define mySizeof(type) ((char*)(&type + 1) - (char*)(&type))

int main() {
    char x;
    int y;
    double z;
    printf("mySizeof(char)   is : %ld\n", mySizeof(x));
    printf("mySizeof(int)    is : %ld\n", mySizeof(y));
    printf("mySizeof(double) is : %ld\n", mySizeof(z));
}

Note: Please ignore whether this simple function can work in all cases; that's not the purpose of this post (though it works for the three cases defined in the program).

My question is: How does it work? (Especially the char* casting part.)

I did some investigating, as follows:


#include <stdio.h>

#define Address(x) (&x)
#define NextAddress(x) (&x + 1)

int main() {
    int n = 1;
    printf("address is : %lld\n", Address(n));
    printf("next address is : %lld\n", NextAddress(n));
    printf("size is %lld\n", NextAddress(n) - Address(n));
    return 0;
}

The above sample program outputs:

address is : 140721498241924
next address is : 140721498241928
size is 1

I can see the addresses of &x and &x + 1. Notice that they differ by 4, which means 4 bytes, since the variable is of type int. But when I subtract the two pointers, the result is 1.

Adrian Mole
Chris Bao
  • Does this answer your question? [Implementation of sizeof operator](https://stackoverflow.com/questions/14171117/implementation-of-sizeof-operator) – JASLP doesn't support the IES Jul 14 '22 at 05:12
  • This is just how C pointer arithmetic works, see https://stackoverflow.com/questions/394767/pointer-arithmetic. Subtracting two pointers gives an answer in units of *the type pointed to*. The addresses `140721498241924` and `140721498241928` differ by 4 bytes, so if they are treated as `int *`, they differ by 1 `int`, and subtracting yields 1. If they are `char *`, they differ by 4 chars, and subtracting gives 4. – Nate Eldredge Jul 14 '22 at 05:13
  • Side note: use `%p` with a cast to `void *` for printing pointers. – JASLP doesn't support the IES Jul 14 '22 at 05:13
  • `%ld` is not specified to be the right specifier for `((char*)(&type + 1) - (char*)(&type))`. To mimic `sizeof`, use a cast: `((size_t) ((char*)(&type + 1) - (char*)(&type)))` and then print with `"%zu"`. – chux - Reinstate Monica Jul 14 '22 at 11:48

3 Answers

What you have to remember here is that pointer arithmetic is performed in units of the size of the pointed-to type.

So, if p is a pointer to the first element of an int array, then *p refers to that first element and the result of the p + 1 operation will be the address resulting from adding the size of an int to the address in p; thus, *(p + 1) will refer to the second element of the array, as it should.

In your mySizeof macro, the expression &type + 1 yields the address of type advanced by the size of its type. Subtracting &type from that directly would therefore give 1 (one element). Casting both pointers to char* makes the subtraction count in units of the size of a char, which is guaranteed by the C Standard to be 1 byte, so the result is the size in bytes.

Adrian Mole

Pointers carry information about their type. If you have a pointer to a 4-byte value such as an int, and add 1 to it, you get a pointer to the next int, not a pointer to the second byte of the original int. The same applies to subtraction.

If you want to obtain the item size in bytes, it's necessary to force pointers to point to byte-like items. Hence the typecast to char*.

See also Pointer Arithmetic

dlask

Your implementation of sizeof works for most objects, although you should modify it as follows:

  • the misnamed macro argument type (which cannot be a type) should be parenthesized in the expansion to avoid operator precedence issues.
  • the expression has type ptrdiff_t; it should be cast to size_t.
  • the printf format for size_t is %zu. Note that %ld is incorrect for ptrdiff_t; you should use %td for that type.

Here is a modified version:

#include <stdio.h> 

#define mySizeof(obj) ((size_t)((char *)(&(obj) + 1) - (char *)&(obj)))

int main() {
    char x;
    int y;
    double z;
    printf("mySizeof(char)   is : %zu\n", mySizeof(x));
    printf("mySizeof(int)    is : %zu\n", mySizeof(y));
    printf("mySizeof(double) is : %zu\n", mySizeof(z));
    return 0;
}

How it works:

  • valid pointers can point to an element of an array or to the position just past the last element of the array. Objects that are not arrays are treated as arrays of 1 element for this purpose.
  • so if obj is a valid lvalue, &(obj) + 1 is a valid pointer just past the end of obj in memory, and casting it to (char *) is valid.
  • similarly, (char *)&(obj) is a valid pointer to the beginning of the object, and the only iffy operation here is the subtraction of 2 valid pointers that cannot be considered to point into the same array of char.
  • the C standard makes a special case of character type pointers to allow the representation of objects to be accessed as individual bytes. So (char *)(&(obj) + 1) - (char *)&(obj) effectively evaluates to the number of bytes in the representation of obj.

Note these limitations for this implementation of sizeof:

  • it does not work for types, as in mySizeof(int)
  • the argument must be an object: mySizeof(1) does not work, nor does mySizeof(x + 1)
  • the object may be a struct or an array, as in char foo[3]; mySizeof(foo), but not a string literal such as mySizeof("abc"), nor a compound literal such as mySizeof((char[2]){'a','b'})
chqrlie