14

I encountered the following macro definition when reading the globals.h in the Google V8 project.

// The expression ARRAY_SIZE(a) is a compile-time constant of type
// size_t which represents the number of elements of the given
// array. You should only use ARRAY_SIZE on statically allocated
// arrays.

#define ARRAY_SIZE(a)                               \
  ((sizeof(a) / sizeof(*(a))) /                     \
  static_cast<size_t>(!(sizeof(a) % sizeof(*(a)))))

My question is about the latter part: static_cast<size_t>(!(sizeof(a) % sizeof(*(a)))). My guess is the following: since the latter part always evaluates to 1, which is of type size_t, the whole expression is promoted to size_t.

If this assumption is correct, another question follows: since the result type of the sizeof operator is already size_t, why is such a promotion necessary? What is the benefit of defining the macro in this way?

Peter Mortensen
Lei Mou
  • 3
    BTW the comment is bogus. It should say not to use it on *heap* allocated objects, namely to never pass a bare pointer to it. "Real" arrays that are allocated on the stack work perfectly well with it. – Jens Gustedt Nov 05 '11 at 08:35
  • 1
    I don't think I need to mention that such a macro is outdated? `template<typename T, size_t N> size_t array_size(T (&)[N]){ return N; }` – Xeo Nov 05 '11 at 09:02
  • @Xeo, the `template` function is a good choice, though there will always be a dependency on compiler optimization. After all, `array_size()` is a function that will be executed at runtime. Maybe in C++11 it can be made `constexpr`; I haven't tested that. – iammilind Nov 05 '11 at 09:32
  • 1
    @iammilind: I would expect it can be made `constexpr`, however you can use another solution if you need to ensure compile-time evaluation: a templated struct holding the result in an enum. – Matthieu M. Nov 05 '11 at 14:59
  • @Matthieu, I doubt it's possible to determine the size just by giving the name of a (non-global) variable. To my knowledge, `sizeof()` is the only possible trick in C++03. – iammilind Nov 06 '11 at 04:52
  • @iammilind: read [my answer](http://stackoverflow.com/questions/8018843/macro-definition-array-size/8021113#8021113) ;) It's a bit more involved, but not much, and can be seen in action at [http://ideone.com/wwq96](http://ideone.com/wwq96). – Matthieu M. Nov 06 '11 at 11:21
  • I would like to expand on Xeo's comment: since C++17 we have std::size, which works for arrays: https://en.cppreference.com/w/cpp/iterator/size, so you don't have to maintain your own version of array_size. – smoothware Sep 04 '18 at 15:43
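As a minimal sketch of the std::size suggestion in the last comment, assuming a C++17 compiler: the array name `values` is made up for illustration, and the check is done with static_assert to show that the result is usable as a compile-time constant.

```cpp
#include <cstddef>
#include <iterator>  // std::size (C++17)

// Hypothetical array used only for this illustration.
int values[5];

// std::size has an overload for built-in arrays that returns the element
// count. It is constexpr, so the result can feed a static_assert.
static_assert(std::size(values) == 5, "std::size yields the element count");

// The result can also size another array at compile time.
char one_byte_per_element[std::size(values)];
static_assert(sizeof(one_byte_per_element) == 5, "five chars, five bytes");
```

Passing a bare pointer to std::size does not compile, because pointers match neither the array overload nor the container overload.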

5 Answers

12

As explained, this is a feeble (*) attempt to secure the macro against use with pointers (rather than true arrays) where it would not correctly assess the size of the array. This of course stems from the fact that macros are pure text-based manipulations and have no notion of AST.

Since the question is also tagged C++, I would like to point out that C++ offers a type-safe alternative: templates.

#include <stddef.h>  /* size_t */

#ifdef __cplusplus
   template <size_t N> struct ArraySizeHelper { char _[N]; };

   template <typename T, size_t N>
   ArraySizeHelper<N> makeArraySizeHelper(T(&)[N]);

#  define ARRAY_SIZE(a)  sizeof(makeArraySizeHelper(a))
#else
#  // C definition as shown in Google's code
#endif

Alternatively, we will soon be able to use constexpr:

template <typename T, size_t N>
constexpr size_t size(T (&)[N]) { return N; }

However my favorite compiler (Clang) still does not implement them :x

In both cases, because the function does not accept pointer parameters, you get a compile-time error if the type is not right.
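The claim that pointers are rejected can be checked without triggering a hard compile error. The sketch below repeats the constexpr function under the name `array_size` (taken from Xeo's comment) and adds a small detection trait, `accepts_argument`, which is a hypothetical helper invented for this illustration and not part of the answer's code.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// The constexpr function from the answer, under the name used in the comments.
template <typename T, std::size_t N>
constexpr std::size_t array_size(T (&)[N]) { return N; }

// Hypothetical detection trait: true when array_size(arg) is well-formed
// for an argument of type Arg, false otherwise (via SFINAE).
template <typename Arg, typename = void>
struct accepts_argument : std::false_type {};

template <typename Arg>
struct accepts_argument<Arg, decltype(void(array_size(std::declval<Arg>())))>
    : std::true_type {};

// A reference to a real array deduces T and N.
static_assert(accepts_argument<int (&)[5]>::value, "arrays are accepted");

// For a pointer, deduction of T (&)[N] fails, so the call is ill-formed.
static_assert(!accepts_argument<int*&>::value, "pointers are rejected");
```

In ordinary code there is no need for the trait: calling `array_size(p)` with a pointer `p` simply fails to compile with a "no matching function" error.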

(*) feeble in that it does not work for small objects where the size of the objects is a divisor of the size of a pointer.


Just a demonstration that it is a compile-time value:

#include <cstddef>
#include <iostream>

template <size_t N> void print() { std::cout << N << "\n"; }

int main() {
  int a[5];
  print<ARRAY_SIZE(a)>();
}

See it in action on IDEONE.

Peter Mortensen
Matthieu M.
  • I wish I could give more than +1 to this answer. I was looking for such an alternative in C++03 for a while. Will steal it from you. :) – iammilind Nov 06 '11 at 12:52
  • 1
    @iammilind: I am glad it's of use :) – Matthieu M. Nov 06 '11 at 13:06
  • @MatthieuM. Thank you! This alternative is more interesting than the one asked about in the question. :-) – Lei Mou Nov 07 '11 at 05:26
  • @LeiMou: Keep in mind that the one in the question is still the best I have seen when it comes to C. Google is known to have extensive code bases in both C and C++ (and probably Java, Python and Go as well). – Matthieu M. Nov 07 '11 at 07:20
  • On GCC 4.5.1 the constexpr template version does not compile. I get a "not of literal type" error. This is a shame as it would be much better than using a macro imo. Is this a bug in GCC or a valid restriction? – Ricky65 Nov 27 '11 at 01:37
  • @Ricky65: I am afraid I am not competent enough to answer; you should probably ask this as a separate question (and I would appreciate a warning if it turns out the code I suggest is incorrect). I can only tell that according to the [GCC C++11 Support Page](http://gcc.gnu.org/projects/cxx0x.html) *Generalized Constant Expressions* ([link](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2235.pdf)) are only supported starting from gcc 4.6, so it might be a compiler fluke. – Matthieu M. Nov 27 '11 at 12:59
  • I can confirm that on GCC 4.7 it does work :). Thanks. One wonders why the Standard Committee didn't add it to the standard library considering they added global begin() and end(). – Ricky65 Nov 27 '11 at 18:48
  • @Ricky65: indeed, but then there are *many* functions that you could wish to have as free functions (an `at` for arrays?) and that the committee did not add... but then, we have `std::array` now, which does not suffer from all those issues. – Matthieu M. Nov 27 '11 at 19:50
  • @MatthieuM. Yes, I think they want coders to use std::array for all fixed size arrays in C++11. However, there is lots of legacy code using built-in C arrays which would benefit from global size(), maybe at() too. – Ricky65 Nov 28 '11 at 14:17
10

latter part will always evaluates to 1, which is of type size_t,

Ideally the latter part will evaluate to bool (i.e. true/false), and static_cast<> then converts it to size_t.

why such promotion is necessary? What's the benefit of defining a macro in this way?

I don't know if this is the ideal way to define a macro. However, one hint I find is in the comments: //You should only use ARRAY_SIZE on statically allocated arrays.

Suppose someone passes a pointer: the check would then trip for struct types whose size is greater than the pointer size.

struct S { int i, j, k, l; };
S *p = new S[10];
ARRAY_SIZE(p); // diagnosed at compile time: division by zero

[Note: This technique may not show any error for int*, char* as said.]
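Both cases can be observed by looking at the divisor on its own. In the sketch below, `ARRAY_SIZE_GUARD`, `Big`, `char_ptr` and `big_ptr` are hypothetical names introduced for illustration; the guard is just the second operand of the macro from the question, pulled out so it can be inspected without dividing by it.

```cpp
#include <cstddef>

// Only the divisor of the macro from the question, isolated under a
// hypothetical name so it can be inspected without dividing by it.
#define ARRAY_SIZE_GUARD(a) \
  static_cast<std::size_t>(!(sizeof(a) % sizeof(*(a))))

// A struct that is always larger than a pointer.
struct Big { char bytes[sizeof(void*) + 1]; };

char* char_ptr = nullptr;  // sizeof(*char_ptr) is 1, which divides any size
Big*  big_ptr  = nullptr;  // sizeof(*big_ptr) exceeds sizeof(big_ptr)

// Guard is 1: ARRAY_SIZE(char_ptr) would compile and quietly return
// sizeof(char*), which says nothing about any array length.
static_assert(ARRAY_SIZE_GUARD(char_ptr) == 1, "char* slips through");

// Guard is 0: ARRAY_SIZE(big_ptr) would divide by zero, so the misuse
// is diagnosed by the compiler.
static_assert(ARRAY_SIZE_GUARD(big_ptr) == 0, "Big* is caught");
```

An `int*` typically slips through as well, since on common platforms the pointer size is a multiple of `sizeof(int)`, but that depends on the platform and is not asserted here.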

iammilind
  • 2
    A slightly different explanation can be found here: http://stackoverflow.com/questions/1598773/is-there-a-standard-function-in-c-that-would-return-the-length-of-an-array/1598827#1598827 – rve Nov 05 '11 at 08:00
  • 1
    +1 for the notice at the bottom because I was thinking "wouldn't `int* p = new int[...]; ARRAY_SIZE(p);` pass the test?" – Marlon Nov 05 '11 at 08:58
9

If sizeof(a) / sizeof(*a) has some remainder (i.e. the size of a is not an integral multiple of the size of *a), then the divisor would evaluate to 0 and the compiler would give you a division-by-zero error at compile time.

I can only assume the author of the macro was burned in the past by something that didn't pass that test.

Ben Jackson
  • This occurs in context where `a` is already a pointer, instead of an array (typically, when used on heap-allocated arrays). – Matthieu M. Nov 05 '11 at 15:01
5

In the Linux kernel, the macro is defined as (GCC specific):

#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]) + __must_be_array(arr))

where __must_be_array() is

/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))

and __same_type() is

#define __same_type(a, b) __builtin_types_compatible_p(typeof(a), typeof(b))
Omair
  • 2
    the last part (__must_be_array()) is interesting to compare. – Omair Dec 04 '13 at 04:52
  • BUILD_BUG_ON_ZERO definition according to kernel sources is `#define BUILD_BUG_ON_ZERO(e) (sizeof(struct { int:(-!!(e)); }))` – EFraim Feb 21 '18 at 08:04
2

The second part is there to ensure that sizeof( a ) is divisible by sizeof( *a ).

Hence the (sizeof(a) % sizeof(*(a))) part. If the size is divisible, this expression evaluates to 0. Then comes the ! part: !(0) gives true, which is why the cast is needed. This does not affect the computed size (the division is by 1); it only adds a compile-time check.

Since all of this is evaluated at compile time, if (sizeof(a) % sizeof(*(a))) is not 0, you get a compile-time error for division by zero.
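For a real array, the evaluation can be followed one step at a time. This is a sketch using the macro from the question (with `size_t` spelled `std::size_t`) and a made-up array named `numbers`; each static_assert corresponds to one step of the explanation.

```cpp
#include <cstddef>

#define ARRAY_SIZE(a)                               \
  ((sizeof(a) / sizeof(*(a))) /                     \
  static_cast<std::size_t>(!(sizeof(a) % sizeof(*(a)))))

int numbers[5];  // a true array, chosen for illustration

// Step 1: the total size is an exact multiple of the element size.
static_assert(sizeof(numbers) % sizeof(*numbers) == 0, "no remainder");

// Step 2: negating that zero remainder yields true.
static_assert(!(sizeof(numbers) % sizeof(*numbers)), "negation is true");

// Step 3: the cast turns true into the size_t value 1.
static_assert(static_cast<std::size_t>(true) == 1, "true converts to 1");

// Step 4: dividing the element count by 1 leaves it unchanged.
static_assert(ARRAY_SIZE(numbers) == 5, "five elements");
```

Because every step is a constant expression, the whole macro can be used wherever a compile-time constant is required, for example as a template argument or an array bound.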

Kiril Kirov