
Let's say I have a number of strings I use often throughout my program (to store state and things like that). String operations can be expensive, so whenever I refer to them I'd like to use an enumeration. I've seen a couple of solutions so far:

typedef enum {
    STRING_HELLO = 0,
    STRING_WORLD
} string_enum_type;

// Must be in sync with string_enum_type
const char *string_enumerations[] = {
    "Hello",
    "World"
};

The other one I encounter quite often:

typedef enum {
    STRING_HELLO,
    STRING_WORLD
} string_enum_type;

const char *string_enumerations[] = {
    [STRING_HELLO] = "Hello",
    [STRING_WORLD] = "World"
};

What are the pros and cons of these two methods? Is there a better one?

Claudio Cortese
PoVa
  • The second is better, as it's independent of the enum values, which means that changes to the enum will not require you to re-match the indexing of the array – Omer Dagan Feb 26 '18 at 10:42
  • You want to operate on state handles. I recommend using pointers to *static* state description data as state handles. A state description could be a string, but a struct that stores the state name and other relevant info is probably better. You only need to compare pointers, not the strings themselves. – n. m. could be an AI Feb 26 '18 at 10:55
  • The second method is not only better, but the first method is downright dangerous because the enum and the strings can easily go out of sync. – Jabberwocky Feb 26 '18 at 11:02
  • @MichaelWalz - they can also get out of sync with the second method (e.g. if an enum value is not used as a designator, or the array initialisation includes additional elements). Admittedly there is more of a visual cue for the programmer when that happens. – Peter Feb 26 '18 at 11:13
  • `[STRING_WORLD] = "World"` is going to waste a lot of space should someone set `STRING_WORLD = INT_MAX` in the `enum` definition... – Andrew Henle Feb 26 '18 at 11:15
  • True story: A few months ago I was in a hurry and so chose the first method in a program I'm writing at work. *Twice* since then I've managed to add items to the enum, but forgot to add them to the string table. In both cases this led to massive confusion which cost me significant time. My conclusion: don't use the first method. – Steve Summit Feb 26 '18 at 12:33
  • Possible duplicate of [Translate error codes to string to display](https://stackoverflow.com/questions/3975313/translate-error-codes-to-string-to-display) – harper Mar 01 '18 at 12:50

3 Answers


The only advantage with the former is that it's backwards-compatible with ancient C standards.

Apart from that, the latter alternative is superior, as it ensures data integrity even if the enum is modified or items change places. However, it should be completed with a check to ensure that the number of items in the enum corresponds with the number of items in the look-up table:

typedef enum {
    STRING_HELLO,
    STRING_WORLD,
    STRING_N  // counter
} string_enum_type;

const char *string_enumerations[] = {
    [STRING_HELLO] = "Hello",
    [STRING_WORLD] = "World"
};

_Static_assert(sizeof string_enumerations/sizeof *string_enumerations == STRING_N,
               "string_enum_type does not match string_enumerations");

The above is the best method for a simple "enum - lookup table" coupling. Another option would be to use structs, but that's more suitable for more complex data types.
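One gap worth spelling out: the size check cannot detect a designated initializer that is missing from the middle of the table, because that slot is silently initialized to a null pointer. A minimal sketch of how that could be covered at run time, reusing the names above. `string_table_is_complete` and `string_name` are hypothetical helpers introduced only for illustration:

```c
#include <stddef.h>

typedef enum {
    STRING_HELLO,
    STRING_WORLD,
    STRING_N  /* counter, must stay last */
} string_enum_type;

static const char *const string_enumerations[] = {
    [STRING_HELLO] = "Hello",
    [STRING_WORLD] = "World"
};

_Static_assert(sizeof string_enumerations / sizeof *string_enumerations == STRING_N,
               "string_enum_type does not match string_enumerations");

/* Hypothetical helper: returns 1 if every slot holds a string, 0 if any
   designated initializer was forgotten (such slots are null pointers). */
int string_table_is_complete(void)
{
    for (size_t i = 0; i < (size_t)STRING_N; i++) {
        if (string_enumerations[i] == NULL) {
            return 0;
        }
    }
    return 1;
}

/* Hypothetical helper: bounds-checked lookup, NULL for invalid values. */
const char *string_name(string_enum_type e)
{
    if ((int)e < 0 || (int)e >= STRING_N) {
        return NULL;
    }
    return string_enumerations[e];
}
```

Calling `string_table_is_complete()` once at start-up (or in a unit test) would flag a forgotten entry, and `string_name` keeps a stray value from indexing past the end of the table.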


And finally, more as a side-note, the 3rd version would be to use "X macros". This is not recommended unless you have specialized requirements regarding code repetition and maintenance. I'll include it here for completeness, but I don't recommend it in the general case:

#define STRING_LIST          \
 /* index         str    */  \
  X(STRING_HELLO, "Hello")   \
  X(STRING_WORLD, "World")


typedef enum {
  #define X(index, str) index,
    STRING_LIST
  #undef X
  STRING_N // counter
} string_enum_type;


const char *string_enumerations[] = {
  #define X(index, str) [index] = str,
    STRING_LIST
  #undef X
};

_Static_assert(sizeof string_enumerations/sizeof *string_enumerations == STRING_N,
               "string_enum_type does not match string_enumerations");
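To illustrate what the X macro buys, here is a sketch in which a third entry is added in exactly one place and both the enum and the table pick it up. `STRING_BYE` and its text are hypothetical, added only for the demonstration:

```c
#define STRING_LIST              \
 /* index         str       */   \
  X(STRING_HELLO, "Hello")       \
  X(STRING_WORLD, "World")       \
  X(STRING_BYE,   "Goodbye")   /* hypothetical new entry: the only edit needed */

typedef enum {
  #define X(index, str) index,
    STRING_LIST
  #undef X
  STRING_N /* counter, now evaluates to 3 */
} string_enum_type;

const char *string_enumerations[] = {
  #define X(index, str) [index] = str,
    STRING_LIST
  #undef X
};

_Static_assert(sizeof string_enumerations / sizeof *string_enumerations == STRING_N,
               "string_enum_type does not match string_enumerations");
```

Because both definitions are generated from the same list, an entry cannot be present in the enum but missing from the table. The price is code that is harder to read and to step through in a debugger.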
T.J. Crowder
Lundin
  • "However, it should be completed with a check to ensure that the number of items in the enum corresponds with the number of items in the look-up table:" -- Perhaps worth noting is that this check is slightly more reliable with the OP's first approach. With the second, if an initialiser other than the last is missing, it will be undetectable by this check. –  Feb 26 '18 at 13:02
  • @hvd True, neither version is fool proof. For example, if someone assigns a value explicitly to one enum item (other than 0 for the first one), then both versions fail. To protect against the issue you mention, I suppose one could add a run-time check such as `for(size_t i=0; i – Lundin Feb 26 '18 at 13:39
  • Instead of `const char *string_enumerations[]` and a `_Static_assert` check, why not force the array size with `const char *string_enumerations[STRING_N]`? – chux - Reinstate Monica Feb 26 '18 at 15:40
  • @chux Because it doesn't add anything. C can protect against an initializer list that's too large, but not against one that's too small. So if you set the fixed size, but forget one array initializer, the program would compile cleanly. Although I guess leaving out the array size makes more sense in the version without designated initializers, as the size of the array depends completely on the amount of initializers used. – Lundin Feb 26 '18 at 15:49
  • "forget one array initializer" is not prevented much in this code either. `const char *string_enumerations[] = { [42] = "Hello", };` still makes for an array size of 43. One element points to a string and the others are null pointers (`0/NULL`). – chux - Reinstate Monica Feb 26 '18 at 16:02
  • So " array depends completely on the amount of initializers used" is more like the array element count is the greatest `enum` used + 1. – chux - Reinstate Monica Feb 26 '18 at 16:05
  • A completely different compile-time check can be obtained, under some compilers at least, by using a function and a `switch` statement, as in my answer. (Whether such a function is appropriate to the original question is another story.) – Steve Summit Feb 26 '18 at 23:16
  • @SteveSummit Such a switch will get optimized to some manner of look-up table anyhow. The gcc warning will only come if you omit the `default`, which is a bad thing to do, because that will also eliminate the out-of-bounds checking. Suppose for example that the programmer by accident passes an enumeration constant belonging to another enum. C has no built-in type safety for enums. – Lundin Feb 27 '18 at 07:50
  • @Lundin Having the `switch` turn into a lookup table is fine. (That's kind of the point!) But you make a very good observation about the potentially confounding influence of a `default` case. See my answer for a note on that. – Steve Summit Mar 01 '18 at 11:58

Another possibility might be to use a function, instead of an array:

const char *enumtostring(string_enum_type e) {
    switch(e) {
        case STRING_HELLO: return "hello";
        case STRING_WORLD: return "world";
    }
}

gcc, at least, will warn (via -Wswitch, which is enabled by -Wall) if you add an enum value but forget to add the matching switch case.

(I suppose you could try making this sort of function inline, as well.)


Addendum: The gcc warning I mentioned applies only if the switch statement does not have a default case. So if you want to print something for out-of-bounds values that somehow creep through, you could do that, not with a default case, but with something like this:

const char *enumtostring(string_enum_type e) {
    switch(e) {
        case STRING_HELLO: return "hello";
        case STRING_WORLD: return "world";
    }
    return "(unrecognized string_enum_type value)";
}

It's also nice to include the out-of-bounds value:

    static char tmpbuf[50];
    snprintf(tmpbuf, sizeof(tmpbuf), "(unrecognized string_enum_type value %d)", e);
    return tmpbuf;

(This last fragment has a couple of additional limitations, but this addendum is getting long already, so I won't belabor the point with them just now.)
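For reference, the pieces above could be assembled into one function roughly as follows. This is a sketch, not a polished implementation, and the limitations named in the comments are one reading of what such a fragment implies rather than an exhaustive list:

```c
#include <stdio.h>

typedef enum {
    STRING_HELLO,
    STRING_WORLD
} string_enum_type;

const char *enumtostring(string_enum_type e)
{
    /* Shared static buffer: the text is overwritten by the next call that
       hits the fallback path, and the function is not thread-safe. */
    static char tmpbuf[50];

    /* No default case on purpose, so that gcc can still warn about
       enumerators that are missing from the switch. */
    switch (e) {
        case STRING_HELLO: return "hello";
        case STRING_WORLD: return "world";
    }

    /* Only reached for values that match no enumerator. */
    snprintf(tmpbuf, sizeof tmpbuf,
             "(unrecognized string_enum_type value %d)", (int)e);
    return tmpbuf;
}
```

A caller that needs to keep the fallback text should copy it before calling `enumtostring` again with another unrecognized value.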

Steve Summit
  • This code can be improved significantly by changing the parameter to `(const string_enum_type* e)` and then `switch(*e)`. That way you guarantee that an enum of correct type is passed, and not just any random `int`. That way, you outsource the out-of-bounds check to the caller. – Lundin Feb 27 '18 at 07:56
  • @Lundin Interesting idea, although I'm not sure even that would be sufficient. I've added a note to the answer to cover the out-of-bounds case. – Steve Summit Mar 01 '18 at 11:56
  • It would be enough to protect against passing the wrong enumerated type, as long as it is passed with a pointer. See [How to create type safe enums?](https://stackoverflow.com/questions/43043246/how-to-create-type-safe-enums) for various tips & tricks on that topic. – Lundin Mar 01 '18 at 12:22
  • @Lundin As I said, interesting idea, and this isn't the place for a long discussion on this, but the pointer technique by itself doesn't even protect against `string_enum_type x = 42; const char *p = enumtostring(&x)`, let alone more exotic transgressions. (Yes, I see all the valiant attempts at protection in [the linked thread](https://stackoverflow.com/questions/43043246/), but my point is that an author of `enumtostring()` can't necessarily depend on all that.) – Steve Summit Mar 01 '18 at 12:50

Another possibility is to use #defines.

In spite of the many cons of their use, the main benefit is that #defines take up no space unless they are used...

#define STRING_HELLO "Hello"
#define STRING_WORLD "World"
Alejandro Blasco