
Intro

The enum type in C++ is fairly basic; it just creates a set of named compile-time constants (optionally with proper scoping via enum class).

It's very attractive for grouping related compile-time constants together:

enum class Animal{
DOG, 
CAT,
COW,
...
};
// ...
Animal myAnimal = Animal::DOG;

However it has a variety of perceived shortcomings, including:

  • No standard way to get the number of possible elements
  • No iteration over elements
  • No easy association of enum with string

In this post, I seek to create a type that addresses those perceived shortcomings.

An ideal solution takes the notion of compile-time knowledge of constants and their associations with strings and groups them together into a scoped-enum-like object that is searchable both by enum id and enum string name. Finally, the resulting type would use syntax that is as close to enum syntax as possible.

In this post I'll first outline what others have attempted for the individual pieces, and then walk through two approaches: one that accomplishes the above but has undefined behavior due to the order of initialization of static members, and another that has less-pretty syntax but no such undefined behavior.


Prior work

There are plenty of questions on SO about getting the number of items in an enum (1 2 3) and plenty of other questions on the web asking the same thing (4 5 6) etc. And the general consensus is that there's no sure-fire way to do it.

The N'th element trick

The following pattern only works if you enforce that the enum values start at 0 and increase by 1 with no gaps:

enum Foo{A=0, B, C, D, FOOCOUNT}; // FOOCOUNT is 4

But it is easily broken if you're trying to encode some sort of business logic that requires arbitrary values:

enum Foo{A=-1, B=120, C=42, D=6, FOOCOUNT}; // FOOCOUNT is 7 (D + 1), not the number of elements
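A minimal sketch of why the trailing-sentinel trick depends on contiguous values. The enum and enumerator names below are made up for illustration; the point is that the sentinel is simply the previous enumerator plus one, which only equals the element count when the values run 0, 1, 2, and so on:

```cpp
// An enumerator without an initializer is the previous enumerator plus one,
// so a trailing sentinel only "counts" when values start at 0 with no gaps.
enum Contiguous { A = 0, B, C, D, CONTIGUOUS_COUNT };
enum Sparse { W = -1, X = 120, Y = 42, Z = 6, SPARSE_COUNT };

static_assert(CONTIGUOUS_COUNT == 4, "sentinel matches the number of elements");
static_assert(SPARSE_COUNT == 7, "sentinel is just Z + 1, not the element count");
```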

Boost Enum

And so the developers at Boost attempted to solve the issue with Boost.Enum which uses some fairly complicated macros to expand into some code that will at least give you the size.

Iterable Enums

There have been a few attempts at iterable enums: enum-like objects that one can iterate over, theoretically allowing for implicit size computation, or even explicit size computation in the case of [7] (7 8 9, ...)

Enum to String conversion

Attempts to implement this usually result in free-floating functions and the use of macros to call them appropriately. (8 9 10)

This also covers searching enums by string (13)


Additional Constraints

  • No macros

    Yes, this means no Boost.Enum or similar approach

  • Need int->Enum and Enum->int conversion

    A rather unique problem when you start moving away from actual enums.

  • Need to be able to find enum by int (or string)

    Also a problem one runs into when they move away from actual enums. The list of enums is considered a collection, and the user wants to interrogate it for specific, known-at-compile-time values. (See iterable enums and Enum to String conversion)

At this point it becomes pretty clear that we cannot really use an enum anymore. However, I'd still like an enum-like interface for the user.

Approach

Let's say I think that I'm super clever and realize that if I have some class A:

struct A
{
   static int myInt;
};
int A::myInt;

Then I can access myInt by saying A::myInt.

Which is the same way I'd access an enum:

enum A{myInt};
// ...
// A::myInt

I say to myself: well I know all my enum values ahead of time, so an enum is basically like this:

struct MyEnum
{
    static const int A;
    static const int B;
    // ...
};

const int MyEnum::A = 0;
const int MyEnum::B = 1;
// ...

Next, I want to get fancier; let's address the constraint where we need std::string and int conversions:

struct EnumValue
{
    EnumValue(std::string _name): name(std::move(_name)), id(gid){++gid;}
    std::string name;
    int id;
    operator std::string() const
    {
       return name;
    }

    operator int() const
    {
       return id;
    }

    private:
        static int gid;
};

int EnumValue::gid = 0;
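A short usage sketch of the two conversions. `EnumValue` is repeated so the snippet compiles on its own, and the `demo` function is hypothetical. Note that ids are handed out in construction order, so they depend on the order in which the objects happen to be initialized:

```cpp
#include <cassert>
#include <string>
#include <utility>

struct EnumValue
{
    EnumValue(std::string _name) : name(std::move(_name)), id(gid++) {}
    std::string name;
    int id;
    operator std::string() const { return name; }
    operator int() const { return id; }

private:
    static int gid;  // next id to hand out
};

int EnumValue::gid = 0;

// Hypothetical usage: both conversions happen implicitly.
void demo()
{
    EnumValue alpha("Alpha");
    EnumValue beta("Beta");
    int id = beta;             // uses operator int()
    std::string name = alpha;  // uses operator std::string()
    assert(name == "Alpha");
    assert(id == alpha.id + 1);  // ids increase in construction order
}
```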

And then I can declare some containing class with static EnumValues:

MyEnum v1

class MyEnum
{
    public:
    static const EnumValue Alpha;
    static const EnumValue Beta;
    static const EnumValue Gamma;

};

const EnumValue MyEnum::Alpha = EnumValue("Alpha");
const EnumValue MyEnum::Beta  = EnumValue("Beta");
const EnumValue MyEnum::Gamma = EnumValue("Gamma");

Great! That solves some of our constraints, but how about searching the collection? Hm, well if we now add a static container like unordered_map, then things get even cooler! Throw in some #defines to alleviate string typos, too:


MyEnum v2

#define ALPHA "Alpha"
#define BETA "Beta"
#define GAMMA "Gamma"
// ...

class MyEnum
{
    public:
    static const EnumValue& Alpha;
    static const EnumValue& Beta;
    static const EnumValue& Gamma;
    static const EnumValue& StringToEnumeration(std::string _in)
    {
        return enumerations.find(_in)->second;
    }

    static const EnumValue& IDToEnumeration(int _id)
    {
        auto iter = std::find_if(enumerations.cbegin(), enumerations.cend(), 
        [_id](const map_value_type& vt)
        { 
            return vt.second.id == _id;
        });
        return iter->second;
    }

    static size_t size()
    {
        return enumerations.size();
    }

    private:
    typedef std::unordered_map<std::string, EnumValue>  map_type ;
    typedef map_type::value_type map_value_type ;
    static const map_type enumerations;
};


const std::unordered_map<std::string, EnumValue> MyEnum::enumerations =
{ 
    {ALPHA, EnumValue(ALPHA)}, 
    {BETA, EnumValue(BETA)},
    {GAMMA, EnumValue(GAMMA)}
};

const EnumValue& MyEnum::Alpha = enumerations.find(ALPHA)->second;
const EnumValue& MyEnum::Beta  = enumerations.find(BETA)->second;
const EnumValue& MyEnum::Gamma  = enumerations.find(GAMMA)->second;

Full working demo HERE!


Now I get the added benefit of searching the container of enums by name or id:

std::cout << MyEnum::StringToEnumeration(ALPHA).id << std::endl; //should give 0
std::cout << MyEnum::IDToEnumeration(0).name << std::endl; //should give "Alpha"
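Both lookup functions above dereference the iterator returned by the search without comparing it to the end of the map, so an unknown name or id is undefined behavior. One possible sketch of a checked lookup follows; the `Entry` struct (a stand-in for `EnumValue`), the free function `find_by_name`, and the choice to return a null pointer on a miss are all assumptions made for illustration, not part of the class above:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Stand-in for EnumValue, reduced to the two fields the lookup needs.
struct Entry
{
    std::string name;
    int id;
};

using EntryMap = std::unordered_map<std::string, Entry>;

// Returns a pointer to the matching entry, or nullptr if the name is unknown.
const Entry* find_by_name(const EntryMap& entries, const std::string& name)
{
    auto it = entries.find(name);
    return it == entries.end() ? nullptr : &it->second;
}

void demo()
{
    const EntryMap entries{{"Alpha", {"Alpha", 0}}, {"Beta", {"Beta", 1}}};
    const Entry* hit = find_by_name(entries, "Beta");
    assert(hit != nullptr && hit->id == 1);
    assert(find_by_name(entries, "Omega") == nullptr);  // miss is reported, not UB
}
```

Throwing an exception or returning a designated "invalid" value would work just as well; the important part is that the miss is handled before the iterator is dereferenced.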

BUT

This all feels very wrong. We're initializing a LOT of static data. I mean, it wasn't until recently that we could populate a map at compile time! (11)

Then there's the issue of the static-initialization order fiasco:

A subtle way to crash your program.

The static initialization order fiasco is a very subtle and commonly misunderstood aspect of C++. Unfortunately it’s very hard to detect — the errors often occur before main() begins.

In short, suppose you have two static objects x and y which exist in separate source files, say x.cpp and y.cpp. Suppose further that the initialization for the y object (typically the y object’s constructor) calls some method on the x object.

That’s it. It’s that simple.

The tragedy is that you have a 50%-50% chance of dying. If the compilation unit for x.cpp happens to get initialized first, all is well. But if the compilation unit for y.cpp get initialized first, then y’s initialization will get run before x’s initialization, and you’re toast. E.g., y’s constructor could call a method on the x object, yet the x object hasn’t yet been constructed.

I hear they’re hiring down at McDonalds. Enjoy your new job flipping burgers.

If you think it’s “exciting” to play Russian Roulette with live rounds in half the chambers, you can stop reading here. On the other hand if you like to improve your chances of survival by preventing disasters in a systematic way, you probably want to read the next FAQ.

Note: The static initialization order fiasco can also, in some cases, apply to built-in/intrinsic types.

Which can be remedied with a getter function that initializes your static data and returns it (12):

Fred& GetFred()
{
  static Fred* ans = new Fred();
  return *ans;
}
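A variant of the same idiom, sketched with a function-local static object instead of a leaked pointer. Since C++11, a block-scope static is initialized exactly once, the first time control reaches its declaration, even with concurrent callers. `Fred` here is a placeholder type with a made-up counter so the behavior can be observed. Unlike the `new` version, this object is destroyed at program exit, which matters if other statics use it from their destructors:

```cpp
#include <cassert>

struct Fred
{
    Fred() { ++constructed; }
    static int constructed;  // counts constructor calls, for illustration only
};

int Fred::constructed = 0;  // constant-initialized, so always ready in time

Fred& GetFred()
{
    static Fred ans;  // constructed the first time control passes through here
    return ans;
}

void demo()
{
    Fred& a = GetFred();
    Fred& b = GetFred();
    assert(&a == &b);                // same object on every call
    assert(Fred::constructed == 1);  // constructed exactly once
}
```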

But if I do that, now I have to call a function to initialize my static data, and I lose the pretty syntax you see above!

Questions

So, now I finally get around to my questions:

  • Be honest, how bad is the above approach? In terms of initialization order safety and maintainability?
  • What kind of alternatives do I have that are still pretty for the end user?

EDIT

The comments on this post seem to indicate a strong preference for static accessor functions to get around the static order initialization problem:

 public:
    typedef std::unordered_map<std::string, EnumValue> map_type ;
    typedef map_type::value_type map_value_type ;

    static const map_type& Enumerations()
    {
        static map_type enumerations {
            {ALPHA, EnumValue(ALPHA)}, 
            {BETA, EnumValue(BETA)},
            {GAMMA, EnumValue(GAMMA)}
            };

        return enumerations;
    }

    static const EnumValue& Alpha()
    {
        return Enumerations().find(ALPHA)->second;
    }

    static const EnumValue& Beta()
    {
         return Enumerations().find(BETA)->second;
    }

    static const EnumValue& Gamma()
    {
        return Enumerations().find(GAMMA)->second;
    }
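For reference, a self-contained sketch of how the accessor version might fit together. `EnumValue` is repeated from earlier in trimmed form, the names are written as inline string literals rather than `#define`s, only two values are shown, and `at()` is used so that a bad key throws instead of dereferencing an end iterator. All of these are choices made for this sketch:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>

struct EnumValue
{
    EnumValue(std::string _name) : name(std::move(_name)), id(gid++) {}
    std::string name;
    int id;

private:
    static int gid;
};

int EnumValue::gid = 0;  // constant-initialized before any dynamic initialization

class MyEnum
{
public:
    typedef std::unordered_map<std::string, EnumValue> map_type;

    static const map_type& Enumerations()
    {
        // Built on the first call. Elements of a braced list are evaluated
        // left to right, so ids follow the order written here.
        static const map_type enumerations{
            {"Alpha", EnumValue("Alpha")},
            {"Beta", EnumValue("Beta")}};
        return enumerations;
    }

    static const EnumValue& Alpha() { return Enumerations().at("Alpha"); }
    static const EnumValue& Beta() { return Enumerations().at("Beta"); }
    static std::size_t size() { return Enumerations().size(); }
};

void demo()
{
    assert(MyEnum::Alpha().name == "Alpha");
    assert(MyEnum::Beta().id == MyEnum::Alpha().id + 1);
    assert(MyEnum::size() == 2);
}
```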

Full working demo v2 HERE

Questions

My updated questions are as follows:

  • Is there another way around the static order initialization problem?
  • Is there a way to perhaps only use the accessor function to initialize the unordered_map, but still (safely) be able to access the "enum" values with enum-like syntax? e.g.:

    MyEnum::Enumerations()::Alpha

or

MyEnum::Alpha

Instead of what I currently have:

MyEnum::Alpha()

Regarding the bounty:

I believe an answer to this question will also solve the issues surrounding enums I've elaborated in the post (Enum is in quotes because the resulting type will not be an enum, but we want enum-like behavior):

  • getting the size of an "enum"
  • string to "enum" conversion
  • a searchable "enum".

Specifically, if we could do what I've already done, but somehow accomplish syntax that is enum-like while enforcing static initialization order, I think that would be acceptable.

AndyG
  • I appreciate the honesty in the downvotes. I just hope someone can provide enlightenment and how to maybe do better. – AndyG Jul 17 '15 at 00:33
  • By the way, you still can't populate a map at compile time :-S – Kerrek SB Jul 17 '15 at 00:34
  • You definitely want to fix the initialization order. Don't have a variable at namespace scope; instead have a function returning a reference to a block-local static map. – Kerrek SB Jul 17 '15 at 00:35
  • @KerrekSB: Thank you for chiming in. Is the SO answer I linked to wrt compile-time initialization incorrect? WRT the function returning a ref to a block-local static map: I did address that around [12]; is there no way around that? – AndyG Jul 17 '15 at 00:37
  • I'm voting to close this question as off-topic because if the code is already working, it should go to the [Code Review SE](http://codereview.stackexchange.com/) site. Otherwise it should go to the [MCVE](http://stackoverflow.com/help/mcve) bin as long not improved so far. – πάντα ῥεῖ Jul 17 '15 at 01:01
  • @πάνταῥεῖ: Thank you for the heads-up. I can certainly see that POV, however I did not consider this code to be "working" due to the resulting UB due to static-initialization order, although I suppose I already knew a solution for it and just hoped for something better. – AndyG Jul 17 '15 at 01:08
  • @AndyG Why not the good ole [_Scott Meyer's Singleton_](http://stackoverflow.com/questions/30557133/singleton-and-interface-implementation/30557174#30557174)? And also: If you use _Singleton_, have **one**, not many. – πάντα ῥεῖ Jul 17 '15 at 01:11
  • I don't see anything wrong with this question to deserve down votes or close votes. He's asking a perfectly legitimate question, and it's not asking for a code review either--please at least do the courtesy of reading the question before voting on it. – Jerry Coffin Jul 17 '15 at 01:12
  • @πάνταῥεῖ: Thank you, that would certainly solve the issue; I believe that's what KerrekSB was saying to do, as well as [12] in my post. Is there any way that I could enforce initialization at the namespace scope that is also safe wrt other compilation units? I really would like to avoid forcing the user to directly call a function to get the instance, however if that's the only (safe) way, then that's the only way. – AndyG Jul 17 '15 at 01:44
  • @AndyG: I can't quite trace the reference, but your map is not a block-local variable -- rather, it's a static class data member. I haven't looked in detail, but it seems error-prone to me. I would expect something like `struct X { static const mymap & get_mappings(); };` – Kerrek SB Jul 17 '15 at 08:04
  • @KerrekSB Great, let's go that route for a second; now we've solved the static order initialization fiasco for the map, but what about the enum values (`Alpha`, `Beta`) etc.? Am I forced to also provide static accessor functions for those too or is there some way around that so that I can still somehow have enum-like syntax (e.g. `MyEnum::Alpha`)? – AndyG Jul 17 '15 at 12:42
  • @KerrekSB: See my update for more about what I'm trying to ask. – AndyG Jul 17 '15 at 13:01
  • what about just writing some class with converting operators to int and std::string and just put few of them staticly in some class? then add some method "findByName" or so – David Haim Jul 19 '15 at 15:57
  • @DavidHaim: Thanks for the response, however I do not see how your suggestion differs from what I've already done. – AndyG Jul 19 '15 at 16:36
  • "The question is widely applicable to a large audience." Which one? There are like 5 different questions in here, none of them are clear. What is it that you're actually asking and can you rewrite the question to actually ask it? – Barry Jul 19 '15 at 23:16
  • @Barry I appreciate your concern. I tried to post at the end what my specific questions were, but I'll re-think how I can be more clear. In terms of wide application, I believe an answer to this question will also solve the issue of getting the size of an "enum", string to "enum" conversion, and a searchable "enum". (Enum is in quotes because the resulting type will not be an enum, but we want enum-like behavior). Specifically, if we could do what I've already done, but somehow accomplish syntax that is enum-like while enforcing static initialization order, I think that would be acceptable – AndyG Jul 20 '15 at 00:06
  • I think the v2 looks like a good approach. I've seen it done this way, or at least very similar, in practice. And I recently saw a way to initialize it automagically only on use (at run time). I believe it was in a text book written by Bjarne. Personally, never seen a reason for it and I've nearly always avoided enums except where there was no other way (aside the forbidden #define). I still always have been curious about this type of problem (more for the theory than the practice of it) – ydobonebi Jul 20 '15 at 03:49
  • @QuinnRoundy: thank you. If you can dig up the text book version and the auto init, please post it as an answer. – AndyG Jul 20 '15 at 11:50
  • Please bulk up your introduction a lot. For a question of this scale (this question constitutes an essay) it really needs to be a full paragraph that summarizes your thesis and question. "Wall of text" is just an admission that the post is poorly structured. – QuestionC Jul 20 '15 at 14:54
  • @QuestionC: See my update. – AndyG Jul 20 '15 at 15:15
  • Given the scope and difficulty, I think you may be able to motivate more people to design an answer if you award more points. (Mind you, I'm not trying to increase my profit or anything; in any case I will not attempt to answer the question, because I don't feel up to the task and I don't have time to study it in more detail.) – Julian Jul 23 '15 at 14:40
  • @Julian: I sympathize with the sentiment. I am not too desperate for an answer as I already have a working implementation that I would like to prettify and didn't know if it was possible in C++. Also, it was my first bounty offer and I was a little intimidated :-) – AndyG Jul 23 '15 at 14:57
  • @AndyG: understandable! – Julian Jul 23 '15 at 15:01

2 Answers


Sometimes when you want to do something that isn't supported by the language, you should look external to the language to support it. In this case, code-generation seems like the best option.

Start with a file with your enumeration. I'll pick XML completely arbitrarily, but really any reasonable format is fine:

<enum name="MyEnum">
    <item name="ALPHA" />
    <item name="BETA" />
    <item name="GAMMA" />
</enum>

It's easy enough to add whatever optional fields you need in there (do you need a value? Should the enum be unscoped? Have a specified type?).

Then you write a code generator in the language of your choice that turns that file into a C++ header (or header/source) file a la:

enum class MyEnum {
    ALPHA,
    BETA,
    GAMMA,
};

std::string to_string(MyEnum e) {
    switch (e) {
    case MyEnum::ALPHA: return "ALPHA";
    case MyEnum::BETA: return "BETA";
    case MyEnum::GAMMA: return "GAMMA";
    }
    return "";  // only reached if e holds a value that is not a named enumerator
}

MyEnum to_enum(const std::string& s) {
    static std::unordered_map<std::string, MyEnum> m{
        {"ALPHA", MyEnum::ALPHA},
        ...
    };

    auto it = m.find(s);
    if (it != m.end()) {
        return it->second;
    }
    else {
        /* up to you */
    }
}

The advantage of the code generation approach is that it's easy to generate whatever arbitrary complex code you want for your enums. Basically just side-step all the problems you're currently having.
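A rough sketch of the generator step, written in C++ for consistency with the rest of this page. It skips parsing entirely and assumes the enum name and item names have already been read from the input file. The function name `generate_enum_header` and the exact layout of the emitted code are illustrative choices, not something the answer prescribes:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Emits the enum definition plus a to_string() overload as C++ source text.
std::string generate_enum_header(const std::string& name,
                                 const std::vector<std::string>& items)
{
    std::string out = "enum class " + name + " {\n";
    for (const auto& item : items) {
        out += "    " + item + ",\n";
    }
    out += "};\n\n";

    out += "inline std::string to_string(" + name + " e) {\n";
    out += "    switch (e) {\n";
    for (const auto& item : items) {
        out += "    case " + name + "::" + item + ": return \"" + item + "\";\n";
    }
    out += "    }\n";
    out += "    return {};\n";
    out += "}\n";
    return out;
}

void demo()
{
    const std::string header =
        generate_enum_header("MyEnum", {"ALPHA", "BETA", "GAMMA"});
    assert(header.find("enum class MyEnum {") == 0);
    assert(header.find("case MyEnum::BETA: return \"BETA\";") != std::string::npos);
}
```

A real generator would also emit the include lines and the string-to-enum function, and would typically be run as a build step so the header never goes stale.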

Barry
  • The header generator could also generate a global constexpr std::initializer_list with all the enum values so you can iterate over them – KABoissonneault Jul 20 '15 at 17:03
  • Thanks for the answer, Barry (and the addendum, KABoissoneault). Sounds like this is the only safe way to get what I want. Seems that this could be done a bit easier if I were permitted to use macros, yes? Perhaps I could push hard to make an exception :-) – AndyG Jul 20 '15 at 19:05
  • @AndyG Code generation tops macros in lots of ways. The main ones being (1) you get a complete language to generate your code in, with all the advantages of that (try repeating something with a macro...) and (2) you end up with actual, readable C++ code (try *that* with a macro...) – Barry Jul 20 '15 at 19:11
  • @AndyG What barry shows here is something that is already available in other Tools. Take a look at [Qt's Enums](http://doc.qt.io/qt-5/qobject.html#Q_ENUM). You define your enums normally, add a `Q_ENUM(MyEnum)` somewhere in the headerfile and that's it. The MetaObjectCompiler will scan your headerfiles and will generate the required code. [QMetaEnum](http://doc.qt.io/qt-5/qmetaenum.html) provides you with Conversion Functions and everything else you need to know about your Enum. – Timo Jul 25 '15 at 15:55
  • +1 for easiest to maintain. I do this all over the place. Note also that OpenGL headers can be generated directly in this way, and is much easier than managing giant macro-only headers. – defube Jul 25 '15 at 20:30

I usually prefer non-macro code but in this case, I don't see what's wrong with macros.
IMHO, for this task macros are a much better fit as they are simpler and shorter to write and to read, and the same goes for the generated code. Simplicity is a goal in its own right.

These 2 macro calls:

#define Animal_Members(LAMBDA) \
    LAMBDA(DOG) \
    LAMBDA(CAT) \
    LAMBDA(COW) \

CREATE_ENUM(Animal,None);

Generate this:

struct Animal {
  enum Id {
    None,
    DOG,
    CAT,
    COW
  };
  static Id fromString( const char* s ) {
    if( !s ) return None;
    if( strcmp(s,"DOG")==0 ) return DOG;
    if( strcmp(s,"CAT")==0 ) return CAT;
    if( strcmp(s,"COW")==0 ) return COW;
    return None;
  }
  static const char* toString( Id id ) {
    switch( id ) {
      case DOG: return "DOG";
      case CAT: return "CAT";
      case COW: return "COW";
      default: return nullptr;
    }
  }
  static size_t count() {
    static Id all[] = { None, DOG, CAT, COW };
    return sizeof(all) / sizeof(Id);
  }
};

You could wrap them into a single macro using BOOST_PP and have a sequence for the members. This would make it a lot less readable, though.
You can easily change it to your preferences of default return values, or remove the default altogether, add a specific member value and string name, etc.
There are no loose functions, no init order hell, and only a bit of macro code that looks very much like the final result:

#define ENUM_MEMBER(MEMBER)                         \
    , MEMBER
#define ENUM_FROM_STRING(MEMBER)                    \
    if( strcmp(s,#MEMBER)==0 ) return MEMBER;
#define ENUM_TO_STRING(MEMBER)                      \
    case MEMBER: return #MEMBER;
#define CREATE_ENUM_1(NAME,MACRO,DEFAULT)           \
    struct NAME {                                   \
        enum Id {                                   \
            DEFAULT                                 \
            MACRO(ENUM_MEMBER)                      \
        };                                          \
        static Id fromString( const char* s ) {     \
            if( !s ) return DEFAULT;                \
            MACRO(ENUM_FROM_STRING)                 \
            return DEFAULT;                         \
        }                                           \
        static const char* toString( Id id ) {      \
            switch( id ) {                          \
            MACRO(ENUM_TO_STRING)                   \
            default: return nullptr;                \
            }                                       \
        }                                           \
        static size_t count() {                     \
            static Id all[] = { DEFAULT             \
                MACRO(ENUM_MEMBER) };               \
            return sizeof(all) / sizeof(Id);        \
        }                                           \
    };
#define CREATE_ENUM_2(NAME,DEFAULT) \
    CREATE_ENUM_1(NAME,NAME##_Members,DEFAULT)
#define CREATE_ENUM(NAME,DEFAULT) \
    CREATE_ENUM_2(NAME,DEFAULT)
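To show the technique end to end, here is a condensed sketch of the same X-macro pattern together with a usage function. The names `COLOR_MEMBERS`, `AS_ENUMERATOR`, `AS_CASE`, `Color`, and `demo` are made up for this example, and only `toString` and `count` are generated. As with the macros above, the `None` default is included in the count:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The member list is written once; each use passes a different macro as X.
#define COLOR_MEMBERS(X) \
    X(RED)               \
    X(GREEN)             \
    X(BLUE)

#define AS_ENUMERATOR(MEMBER) , MEMBER
#define AS_CASE(MEMBER) case MEMBER: return #MEMBER;

struct Color {
    enum Id { None COLOR_MEMBERS(AS_ENUMERATOR) };

    static const char* toString(Id id) {
        switch (id) {
            COLOR_MEMBERS(AS_CASE)
            default: return nullptr;
        }
    }

    static std::size_t count() {
        static const Id all[] = { None COLOR_MEMBERS(AS_ENUMERATOR) };
        return sizeof(all) / sizeof(all[0]);
    }
};

void demo()
{
    assert(std::strcmp(Color::toString(Color::GREEN), "GREEN") == 0);
    assert(Color::toString(Color::None) == nullptr);
    assert(Color::count() == 4);  // None, RED, GREEN, BLUE
}
```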

Hope this helps.

BitWhistler
  • Thank you for the response, BitWhistler. Believe it or not, but I'm generally disallowed from creating and using my own macros for debug reasons, but may be able to push for an exception in this case. – AndyG Jul 25 '15 at 13:52
  • In this case, please make sure the macro names are unique. You can add a prefix or something. Also, you can use the preprocessor to generate the code into a header file using -E and a perl one-liner to split lines on curly braces and semicolons, and to indent, to beautify the code. – BitWhistler Jul 25 '15 at 16:14