23

Is there a compile-time way to detect / prevent duplicate values within a C/C++ enumeration?

The catch is that there are multiple items which are initialized to explicit values.

Background:

I've inherited some C code such as the following:

#define BASE1_VAL    (5)
#define BASE2_VAL    (7)

typedef enum
{
  MsgFoo1A = BASE1_VAL,       // 5
  MsgFoo1B,                   // 6
  MsgFoo1C,                   // 7
  MsgFoo1D,                   // 8
  MsgFoo1E,                   // 9
  MsgFoo2A = BASE2_VAL,       // Uh oh!  7 again...
  MsgFoo2B                    // Uh oh!  8 again...
} FOO;

The problem is that as the code grows & as developers add more messages to the MsgFoo1x group, eventually it overruns BASE2_VAL.

This code will eventually be migrated to C++, so if there is a C++-only solution (template magic?), that's OK -- but a solution that works with C and C++ is better.

phuclv
  • 37,963
  • 15
  • 156
  • 475
Dan
  • 10,303
  • 5
  • 36
  • 53

8 Answers8

14

There are a couple ways to check this compile time, but they might not always work for you. Start by inserting a "marker" enum value right before MsgFoo2A.

typedef enum
{
    MsgFoo1A = BASE1_VAL,
    MsgFoo1B,
    MsgFoo1C,
    MsgFoo1D,
    MsgFoo1E,
    MARKER_1_DONT_USE, /* Don't use this value, but leave it here.  */
    MsgFoo2A = BASE2_VAL,
    MsgFoo2B
} FOO;

Now we need a way to ensure that MARKER_1_DONT_USE < BASE2_VAL at compile-time. There are two common techiques.

Negative size arrays

It is an error to declare an array with negative size. This looks a little ugly, but it works.

extern int IGNORE_ENUM_CHECK[MARKER_1_DONT_USE > BASE2_VAL ? -1 : 1];

Almost every compiler ever written will generate an error if MARKER_1_DONT_USE is greater than BASE_2_VAL. GCC spits out:

test.c:16: error: size of array ‘IGNORE_ENUM_CHECK’ is negative

Static assertions

If your compiler supports C11, you can use _Static_assert. Support for C11 is not ubiquitous, but your compiler may support _Static_assert anyway, especially since the corresponding feature in C++ is widely supported.

_Static_assert(MARKER_1_DONT_USE < BASE2_VAL, "Enum values overlap.");

GCC spits out the following message:

test.c:16:1: error: static assertion failed: "Enum values overlap."
 _Static_assert(MARKER_1_DONT_USE < BASE2_VAL, "Enum values overlap.");
 ^
Dietrich Epp
  • 205,541
  • 37
  • 345
  • 415
  • 10
    It doesn't use the preprocessor, but it is an evil hack nonetheless. – Dietrich Epp Apr 05 '10 at 04:30
  • All useful answers here - accepted this one because it'll work in C as well as C++ (lots of FUD to overcome in the group regarding C++ still), and it seems to be the most likely to be understood & followed by the group. (I didn't show it in the example, but there are multiple items initialized within the enum - probably about 15 right now. Each time a new range/group is added, another "check array" will need to be added. Anyone with a pulse should see the pattern & add a new check array for the new group). – Dan Apr 05 '10 at 19:49
7

I didn't see "pretty" in your requirements, so I submit this solution implemented using the Boost Preprocessor library.

As an up-front disclaimer, I haven't used Boost.Preprocessor a whole lot and I've only tested this with the test cases presented here, so there could be bugs, and there may be an easier, cleaner way to do this. I certainly welcome comments, corrections, suggestions, insults, etc.

Here we go:

#include <boost/preprocessor.hpp>

#define EXPAND_ENUM_VALUE(r, data, i, elem)                          \
    BOOST_PP_SEQ_ELEM(0, elem)                                       \
    BOOST_PP_IIF(                                                    \
        BOOST_PP_EQUAL(BOOST_PP_SEQ_SIZE(elem), 2),                  \
        = BOOST_PP_SEQ_ELEM(1, elem),                                \
        BOOST_PP_EMPTY())                                            \
    BOOST_PP_COMMA_IF(BOOST_PP_NOT_EQUAL(data, BOOST_PP_ADD(i, 1)))

#define ADD_CASE_FOR_ENUM_VALUE(r, data, elem) \
    case BOOST_PP_SEQ_ELEM(0, elem) : break;

#define DEFINE_UNIQUE_ENUM(name, values)                                  \
enum name                                                                 \
{                                                                         \
    BOOST_PP_SEQ_FOR_EACH_I(EXPAND_ENUM_VALUE,                            \
                            BOOST_PP_SEQ_SIZE(values), values)            \
};                                                                        \
                                                                          \
namespace detail                                                          \
{                                                                         \
    void UniqueEnumSanityCheck##name()                                    \
    {                                                                     \
        switch (name())                                                   \
        {                                                                 \
            BOOST_PP_SEQ_FOR_EACH(ADD_CASE_FOR_ENUM_VALUE, name, values)  \
        }                                                                 \
    }                                                                     \
}

We can then use it like so:

DEFINE_UNIQUE_ENUM(DayOfWeek, ((Monday)    (1))
                              ((Tuesday)   (2))
                              ((Wednesday)    )
                              ((Thursday)  (4)))

The enumerator value is optional; this code generates an enumeration equivalent to:

enum DayOfWeek
{
    Monday = 1,
    Tuesday = 2,
    Wednesday,
    Thursday = 4
};

It also generates a sanity-check function that contains a switch statement as described in Ben Voigt's answer. If we change the enumeration declaration such that we have non-unique enumerator values, e.g.,

DEFINE_UNIQUE_ENUM(DayOfWeek, ((Monday)    (1))
                              ((Tuesday)   (2))
                              ((Wednesday)    )
                              ((Thursday)  (1)))

it will not compile (Visual C++ reports the expected error C2196: case value '1' already used).

Thanks also to Matthieu M., whose answer to another question got me interested in the Boost Preprocessor library.

Community
  • 1
  • 1
James McNellis
  • 348,265
  • 75
  • 913
  • 977
  • Looks great. Maybe move the check into a `detail` namespace to keep it out of the way. EDIT: <3. – GManNickG Apr 05 '10 at 05:52
  • @GMan: Now that I did that, though... doesn't the Boost.Preprocessor library work with C? (I'd presume that it would.) If so, maybe not using `namespace` would be better so it works for both languages. Eh... I'll think about that in the morning. – James McNellis Apr 05 '10 at 05:59
  • @James: Hm, good point. I'm pretty sure Boost.PP (is the only Boost library that) works in C. In C, a prefix of `detail_` would work well enough. You could even use `__cplusplus` to choose, so the same header works on both. (Generating different sources obviously.) – GManNickG Apr 05 '10 at 06:05
  • Woo great :) A pity you could not find a better way to define the value rather than embedding a sequence in a sequence, I'm sure there exists a way to do it with a simple comma, but I'm afraid that would require doing without the help of Boost PP and I haven't had the courage to launch myself into it yet :) As for the switch, may I suggest you write something like a `toString` method ? Instead of being a pure compiler guard we could after all use the method for real work :) */me feels very proud that my response got cited* – Matthieu M. Apr 05 '10 at 11:26
  • @Matthieu, the [ENUM macro](http://stackoverflow.com/questions/300592/enum-in-c-like-enum-in-ada/1436615#1436615) uses `=` to define an initial value. – Johannes Schaub - litb Apr 05 '10 at 14:53
  • Absolutely right -- "pretty" definitely takes a backseat to robust, correct, and easy to use. And ultimately, the DEFINE_UNIQUE_ENUM() macro hides the complexity. Kudos. – Dan Apr 05 '10 at 19:37
  • @litb: the problem with the use of `=` is that you cannot use the enumerated values for anything else afterward. In that case for example, you could not use the `switch` trick. – Matthieu M. Apr 06 '10 at 06:33
3

I don't believe there's a way to detect this with the language itself, considering there are conceivable cases where you'd want two enumeration values to be the same. You can, however, always ensure all explicitly set items are at the top of the list:

typedef enum
{
  MsgFoo1A = BASE1_VAL,       // 5
  MsgFoo2A = BASE2_VAL,       // 7
  MsgFoo1B,                   // 8
  MsgFoo1C,                   // 9
  MsgFoo1D,                   // 10
  MsgFoo1E,                   // 11
  MsgFoo2B                    // 12
} FOO;

So long as assigned values are at the top, no collision is possible, unless for some reason the macros expand to values which are the same.

Usually this problem is overcome by giving a fixed number of bits for each MsgFooX group, and ensuring each group does not overflow it's allotted number of bits. The "Number of bits" solution is nice because it allows a bitwise test to determine to which message group something belongs. But there's no built-in language feature to do this because there are legitimate cases for an enum having two of the same value:

typedef enum
{
    gray = 4, //Gr[ae]y should be the same
    grey = 4,
    color = 5, //Also makes sense in some cases
    couleur = 5
} FOO;
Billy ONeal
  • 104,103
  • 58
  • 317
  • 552
  • Sorry, I should have mentioned that the enums should stay ordered consecutively, i.e. MsgFoo1B should be identical to (MsgFoo1A + 1) - my bad. – Dan Apr 05 '10 at 19:40
3

I don't know of anything that will automatically check all enum members, but if you want to check that future changes to the initializers (or the macros they rely on) don't cause collisions:

switch (0) {
    case MsgFoo1A: break;
    case MsgFoo1B: break;
    case MsgFoo1C: break;
    case MsgFoo1D: break;
    case MsgFoo1E: break;
    case MsgFoo2A: break;
    case MsgFoo2B: break;
}

will cause a compiler error if any of the integral values is reused, and most compilers will even tell you what value (the numeric value) was a problem.

Ben Voigt
  • 277,958
  • 43
  • 419
  • 720
1

While we do not have full on reflection, you can solve this problem if you can relist the enumeration values.

Somewhere this is declared:

enum E { A = 0, B = 0 };

elsewhere, we build this machinery:

template<typename S, S s0, S... s>
struct first_not_same_as_rest : std::true_type {};
template<typename S, S s0, S s1, S... s>
struct first_not_same_as_rest : std::integral_constant< bool,
  (s0 != s1) && first_not_same_as_rest< S, s0, s... >::value
> {};


template<typename S, S... s>
struct is_distinct : std::true_type {};

template<typename S, S s0, S... s>
struct is_distinct : std::integral_constant< bool,
  std::is_distinct<S, s...>::value &&
  first_not_same_as_rest< S, s0, s... >::value
> {};

Once you have that machinery (which requires C++11), we can do the following:

static_assert( is_distinct< E, A, B >::value, "duplicate values in E detected" );

and at compile time we will ensure that no two elements are equal.

This requires O(n) recursion depth and O(n^2) work by the compiler at compile time, so for extremely large enums this could cause problems. A O(lg(n)) depth and O(n lg(n)) work with a much larger constant factor can be done by sorting the list of elements first, but that is much, much more work.

With the enum reflection code proposed for C++1y-C++17, this will be doable without relisting the elements.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
1

You could roll a more robust solution of defining enums using Boost.Preprocessor - wether its worth the time is a different matter.

If you are moving to C++ anyway, maybe the (proposed) Boost.Enum suits you (available via the Boost Vault).

Another approach might be to use something like gccxml (or more comfortably pygccxml) to identify candidates for manual inspection.

Georg Fritzsche
  • 97,545
  • 26
  • 194
  • 236
  • I wish I could upvote this (because I upvote almost anything mentioning boost) but I've not used this specific library before and cant :( Do you have an example which demonstrates how it might solve the OP's problem? – Billy ONeal Apr 05 '10 at 04:34
  • @Billy: I mostly only know what Boost.PP can do - i haven't had time to really look into it. The proposed Boost.Enum should be a good start though, using its example for generating the enum definition and also a suitable compile-time check shouldn't be too hard. – Georg Fritzsche Apr 05 '10 at 04:55
  • Makes me realize I'm not using Boost nearly enough, or at least nowhere near its functionality - thanks. – Dan Apr 05 '10 at 19:44
1

Here's a solution using X macro without Boost. First define the X macro and its helper macros. I'm using this solution to portably make 2 overloads for the X macro so that you can define the enum with or without an explicit value. If you're using GCC or Clang then it can be made shorter

#define COUNT_X_ARGS_IMPL2(_1, _2, count, ...) count
#define COUNT_X_ARGS_IMPL(args) COUNT_X_ARGS_IMPL2 args
#define COUNT_X_ARGS(...) COUNT_X_ARGS_IMPL((__VA_ARGS__, 2, 1, 0))

/* Pick the right X macro to invoke. */
#define X_CHOOSE_HELPER2(count) X##count
#define X_CHOOSE_HELPER1(count) X_CHOOSE_HELPER2(count)
#define X_CHOOSE_HELPER(count)  X_CHOOSE_HELPER1(count)

/* The actual macro. */
#define X_GLUE(x, y) x y
#define X(...) X_GLUE(X_CHOOSE_HELPER(COUNT_X_ARGS(__VA_ARGS__)), (__VA_ARGS__))

Then define the macro and check it

#define BASE1_VAL    (5)
#define BASE2_VAL    (7)

// Enum values
#define MY_ENUM             \
X(MsgFoo1A, BASE1_VAL)      \
X(MsgFoo1B)                 \
X(MsgFoo1C)                 \
X(MsgFoo1D)                 \
X(MsgFoo1E)                 \
X(MsgFoo2A, BASE2_VAL)      \
X(MsgFoo2B)

// Define the enum
#define X1(enum_name)               enum_name,
#define X2(enum_name, enum_value)   enum_name = enum_value,
enum foo
{
    MY_ENUM
};
#undef X1
#undef X2

// Check duplicates
#define X1(enum_name)               case enum_name: break;
#define X2(enum_name, enum_value)   case enum_name: break;
static void check_enum_duplicate()
{
    switch(0)
    {
        MY_ENUM
    }
}
#undef X1
#undef X2

Use it

int main()
{
// Do something with the whole enum
#define X1(enum_name)               printf("%s = %d\n", #enum_name, enum_name);
#define X2(enum_name, enum_value)   printf("%s = %d\n", #enum_name, enum_value);
    // Print the whole enum
    MY_ENUM
#undef X1
#undef X2
}
phuclv
  • 37,963
  • 15
  • 156
  • 475
  • 1
    Thanks, this is quite clever and your example was very helpful. I have to move between C and C++ (embedded systems development) so this approach is quite useful. Thank you! – Dan Apr 16 '20 at 17:27
-1

I didn't completely like any of the answers already posted here, but they gave me some ideas. The crucial technique is to rely on Ben Voight's answer of using a switch statement. If multiple cases in a switch share the same number, you'll get a compile error.

Most usefully to both myself and probably the original poster, this doesn't require any C++ features.

To clean things up, I used aaronps's answer at How can I avoid repeating myself when creating a C++ enum and a dependent data structure?

First, define this in some header someplace:

#define DEFINE_ENUM_VALUE(name, value)      name=value,
#define CHECK_ENUM_VALUE(name, value)       case name:
#define DEFINE_ENUM(enum_name, enum_values) \
    typedef enum { enum_values(DEFINE_ENUM_VALUE) } enum_name;
#define CHECK_ENUM(enum_name, enum_values) \
    void enum_name ## _test (void) { switch(0) { enum_values(CHECK_ENUM_VALUE); } }

Now, whenever you need to have an enumeration:

#define COLOR_VALUES(GEN) \
    GEN(Red, 1) \
    GEN(Green, 2) \
    GEN(Blue, 2)

Finally, these lines are required to actually make the enumeration:

DEFINE_ENUM(Color, COLOR_VALUES)
CHECK_ENUM(Color, COLOR_VALUES)

DEFINE_ENUM makes the enum data type itself. CHECK_ENUM makes a test function that switches on all the enum values. The compiler will crash when compiling CHECK_ENUM if you have duplicates.

Community
  • 1
  • 1
James Johnston
  • 9,264
  • 9
  • 48
  • 76
  • The compiler doesn't crash, it just throws an error. If it crashes then definitely it's a compiler bug. Also your `DEFINE_ENUM` requires one to **always specify the enum values** which is not the usual case and isn't always possible – phuclv Apr 15 '20 at 14:40