C++ constexpr array lookup: memory overhead? Other gotchas?

Question

I've implemented a constexpr ~~map~~ array lookup based on this SO answer, but it leaves me wondering now what the memory overhead might be like if the ~~map~~ array is very large, and what other gotchas might exist with this technique, particularly if the constexpr function cannot be resolved at compile time.

Here is a contrived code example that hopefully makes my question more clear:

example.h:

enum class MyEnum
{
  X0,
  X1,
  X2,
  X3,
  X4,
  X5
};

struct MyStruct
{
  const MyEnum type;
  const char* id;
  const char* name;
  const int size;
};

namespace
{
  constexpr MyStruct myMap[] = {
    {MyEnum::X0,"X0","Test 0", 0},
    {MyEnum::X1,"X1","Test 1", 1},
    {MyEnum::X2,"X2","Test 2", 2},
    {MyEnum::X3,"X3","Test 3", 3},
    {MyEnum::X4,"X4","Test 4", 4},
    {MyEnum::X5,"X5","Test 5", 5},
  };

  constexpr auto mapSize = sizeof myMap/sizeof myMap[0];
}

class invalid_map_exception : public std::exception {};

// Retrieves a struct based on the associated enum
inline constexpr MyStruct getStruct(MyEnum key, int range = mapSize) {
  return  (range == 0) ? (throw invalid_map_exception()):
          (myMap[range - 1].type == key) ? myMap[range - 1]:
          getStruct(key, range - 1);
};

example.cpp:

#include <iostream>
#include <vector>
#include "example.h"

int main()
{
  std::vector<MyEnum> enumList = {MyEnum::X0, MyEnum::X1, MyEnum::X2, MyEnum::X3, MyEnum::X4, MyEnum::X5};
  int idx;

  std::cout << "Enter a number between 0 and 5:" << std::endl;
  std::cin >> idx;

  MyStruct test = getStruct(enumList[idx]);

  std::cout << "choice name: " << test.name << std::endl;

  return 0;
}

Output:

Enter a number between 0 and 5:
1
choice name: Test 1

Compiled with g++ with -std=c++14.

In the above example, although getStruct is a constexpr function, it cannot be fully resolved until runtime since the value of idx is not known until then. May that change the memory overhead when compiled with optimization flags, or would the full contents of myMap be included in the binary regardless? Does it depend on the compiler and optimization setting used?

Also, what if the header file is included in multiple translation units? Would myMap be duplicated in each one?

I imagine this could be important if the ~~map~~ array becomes enormous and/or the code is going to be used in more resource constrained environments such as embedded devices.

Are there any other potential gotchas with this approach?

An immediate gotcha is a O(N) lookup complexity, which is pretty bad for a map. — , Feb 06 '19 at 22:29
@Frank indeed. I shouldn't have called it a map because it isn't really a map. It's a function to search an array. I naively followed the naming conventions of the linked SO post. I'll edit the question to correct this mistake. — NanoWizard, Feb 07 '19 at 00:15

score 2 · Accepted Answer · answered Feb 06 '19 at 22:40

If you call a constexpr function with a non-constant expression, it will call the function at run time.

If you call getStruct with a constant expression, the compiler can just call the function at compile time. Then, the getStruct function will be "unused" at runtime, and the compiler will probably optimise it out. At this point, myMap will also be unused, and be optimised out.

In terms of runtime size, it would actually probably be smaller than an std::unordered_map or std::map; It literally stores the minimum information necessary. But it's lookup time would be a lot slower, as it has to compare all the elements individually in O(N) time, so it doesn't actually do what a map does (reduce lookup time).

If you want to make it more likely that it is optimised out, I would ensure that it is only used in constant-expression situations:

template<MyEnum key>
struct getStruct
{
    static constexpr const MyStruct value = _getStruct(key);
}

Here's some compiler output that shows that the map is optimised out entirely

And about including it in multiple translation units, it would be duplicated in every one since you use an anonymous namespace to define it. If it was optimised out in all of them, there would be no overhead, but it would still be duplicated for every translation unit that you do a runtime lookup in.

That's a good point in your comparison with `std::unordered_map` and `std::map`. I suppose calling it `myMap` is misleading because it's not really a map, it's a constexpr array. Thank you, your answer is very helpful! — NanoWizard, Feb 07 '19 at 00:11

C++ constexpr array lookup: memory overhead? Other gotchas?

example.h:

example.cpp:

Output:

1 Answers1