8

I have been using a pattern described in this SO question in my code, to make Compile-time registration lists of various kinds: C++ type registration at compile time trick

For instance if you have a bunch of lua callback functions and you don't want to forget to register them with some lua state, you might declare them using a macro that puts a template type which knows their name and function pointer into a list, and then later you have a one-liner which registers all the functions.

A limitation of this technique (as described in the SO answer) is that if you have n items in the list, it requires template recursion depth O(n) to evaluate. This is not ideal, I actually have quite a few lua callback functions already...

I had believed that the O(n) recursion depth is unavoidable for various reasons, however as I recently learned from Yakk in this (unreleated) answer, some fundamental things which I naively thought required O(n) can actually be done in O(log n) depth. Switch statement variadic template expansion

In particular there's no reason anymore that the datastructures involved should require O(log n) template depth for their operations.

The part I'm not sure about is the Rank trick. From the referenced code, this template

template <int N>
struct Rank : Rank<N - 1> {};

template <>
struct Rank<0> {};

is a key ingredient. It is expensive in terms of template depth though -- instantiating Rank<N> requires template depth N. The important property that it has though, is that if a function f is defined which is overloaded using many different rank types, overload resolution always selects the overload of largest rank, because that is the "most derived instance". E.g. if we have this code:

bool f(Rank<0>) { return false; }
int f(Rank<1>) { return 0; }
float f(Rank<2>) { return 0; }
double f(Rank<3>) { return 0; }

then it's always the case that at any point in the code, decltype(f(Rank<100>{})) has type equal to the return value of the most recently defined overload. I.e. these assertions pass

bool f(Rank<0>) { return false; }
static_assert(std::is_same<bool, decltype(f(Rank<100>{}))>::value, "D:");
int f(Rank<1>) { return 0; }
static_assert(std::is_same<int, decltype(f(Rank<100>{}))>::value, "D:");
float f(Rank<2>) { return 0; }
static_assert(std::is_same<float, decltype(f(Rank<100>{}))>::value, "D:");
double f(Rank<3>) { return 0; }
static_assert(std::is_same<double, decltype(f(Rank<100>{}))>::value, "D:");

Is there a way to do this which does not require template recursion depth O(n)?

Perhaps using more, carefully chosen parameters for the function overloads (?)

Community
  • 1
  • 1
Chris Beck
  • 15,614
  • 4
  • 51
  • 87
  • Did you *measure* the taken? Is it *significant* to you as a proportion of the overall time? Or is this for "academic" interest? – Tony Delroy Aug 27 '15 at 06:16
  • @TonyD: I'm not really concerned about the time so much as hitting the maximum template recursion depth (typically 256) and having the compilation process error out. – Chris Beck Aug 27 '15 at 06:17
  • I use this pattern in a few places -- it would be nice if the limit were pretty large, like, in the millions, rather than in the hundreds, so I don't have to keep track of this stuff. Getting that is more important than getting `O(log n)`, outside of "academic" interest. – Chris Beck Aug 27 '15 at 06:23
  • 1
    The crucial thing that makes this work is that `Rank` is a direct or indirect base class of `Rank` for all `M < N`. You can't really avoid linear instantiations. – T.C. Aug 27 '15 at 06:27
  • @ChrisBeck I mention runtime performance because you're ostensibly going out of your way to use compile-time techniques over either having your macro create some static object with a constructor that registers a single function, or adds the name and function-pointer to a Standard library container for post-`main()` registration. What kind of registration function are you calling that works better with compile-time (function-pointer) type information? Maybe if we could see the lua registration code you end up calling, we could appreciate/assess the need or benefit.... – Tony Delroy Aug 27 '15 at 06:53
  • @TonyD: oftentimes the functions that I want to register a member functions of some class. To avoid creating parallel lists (the whole point of this is to avoid that maintanence burden), they have to be declared and defined in the class body. The template technique can be easily adapted to that setting. Creating static objects with constructors that are initialized inline in some class body and talk to some global static registry object... is problematic. The lua state in question doesn't exist until long after static initialization. – Chris Beck Aug 27 '15 at 06:57
  • I could include some lua code but I think it would be more likely distracting than helpful. – Chris Beck Aug 27 '15 at 06:59
  • @TonyD: I changed my mind, here I typed up the a sketch of the sequence of designs that led down this path: http://hastebin.com/asopumikat.cs If you think there is a much better way to engineer this I'm all ears, but having thought about it quite a bit, I've gotten the most traction with the templates. – Chris Beck Aug 27 '15 at 07:44
  • I am also just interested in the template question for its own sake now though. At least as I understand this trick is one of the easiest ways to try to get around the pure functional nature of TMP and create "slots" that can hold types and be reassigned. The C preprocessor can do this to a limited extent but you can't use it accumulate lists like this AFAIK – Chris Beck Aug 27 '15 at 08:07
  • @ChrisBeck: can't access hastebin from here, but will have a look later. Agree it's an interesting problem space to explore... was doing [something similar](http://stackoverflow.com/a/32223343/410767) yesterday. – Tony Delroy Aug 27 '15 at 10:29
  • Why not the analog way? http://coliru.stacked-crooked.com/a/8be5ae509e6bcf9d – Johannes Schaub - litb Aug 27 '15 at 11:46
  • @TonyD: can you access gist? https://gist.github.com/cbeck88/f081023f40b844417aac – Chris Beck Aug 27 '15 at 14:29
  • @ChrisBeck are you OK if the f's are templates? – Johannes Schaub - litb Aug 27 '15 at 14:41
  • @JohannesSchaub-litb sure, that's an implementation detail. for me it will all be hidden behind a macro anyways – Chris Beck Aug 27 '15 at 14:43
  • @ChrisBeck so my `int******` trick would appear to do. You can generate an arbitrarily long `T**...*` with N stars in log(N) time. – Johannes Schaub - litb Aug 27 '15 at 14:52
  • @JohannesSchaub-litb: so how does this trick work? Coliru seems to have a linker error with it – Chris Beck Aug 27 '15 at 14:54
  • @ChrisBeck of course. The linker error was to show the correct function is called. I left out the definitions of `f` on purpose - the definition doesn't matter :) – Johannes Schaub - litb Aug 27 '15 at 14:56
  • @ChrisBeck alternatively you can use this approach, and always reduce one of the `const`, and pass a non-qualified pointer as argument: http://coliru.stacked-crooked.com/a/0f81c483bcfbdc04 . I feel like the other approach is superior though. This, on the other hand, doesn't require the templates. – Johannes Schaub - litb Aug 27 '15 at 14:57
  • @JohannesSchaub-litb: so I don't understand how I actually use that though. What expression do I substitue for `decltype(f(Rank<100>{}))` in your system? It's not like `decltype(f((int************)0))` is going to select an overload `foo f(int***) { return foo{}; }`? (It would say "cannot convert ..."?) – Chris Beck Aug 27 '15 at 15:01
  • @ChrisBeck what proposal do you look at, the `const` one or the `T*` one? On both cases you would be doing something like `generate_nullpointer_with_N_stars<1000>()`. If you optimize that function to work in log(N) time, you would need 10 instantiates here (perhaps 20 to compute the log of 1000). – Johannes Schaub - litb Aug 27 '15 at 15:02
  • @JohannesSchaub-litb: so the point of the original version is that it creates this type expression `#define GET_REGISTERED_TYPES decltype(f(Rank<100>{}))` which I can "assign" to, and it gives me a way that I can overwrite that "slot", and still use the same expression to access the new value. Then to gather a long list spead over many places in code, I just use some typelists and an append metafunction. In your examples you are defining a bunch of functions with various return values, but I don't see what the equivalent `GET_MOST_RECENT_VALUE` expression is – Chris Beck Aug 27 '15 at 15:08
  • `GetTypes(Tag, MAGIC)` needs to return a list of types. And we need to be able to define a new `GetTypes(Tag,?)` based off the previous list of types that will be a preferred overload to (all) previous `GetTypes` when passed `MAGIC`, constructed via macro from the return value of the previous `GetTypes`. If `MAGIC` is `int*` with some large number of stars, and the GetType is a template function that takes a `T*` with an increasing number of stars, this qualifies. I'm wondering if there is a better way: can we build a structure for MAGIC and ? that is less linear, and more tree-like? – Yakk - Adam Nevraumont Aug 27 '15 at 15:11
  • There's another formulation of the original version in terms of member pointers, where the conversion is inversed. You pass `int Rank<0>::*` as argument, and start defining the functions at `int Rank<2000>::*`. Perhaps that qualified for you aswell, but doesn't free you of the need for the many instantiations, tho. – Johannes Schaub - litb Aug 27 '15 at 15:13

1 Answers1

7

To avoid error of the type:

template instantiation depth exceeds maximum of 900 (use -ftemplate-depth= to increase the maximum) instantiating 'struct Rank<1148u>'

without increasing globally maximum template depth, you may instantiate intermediate templates:

// To allow instantiation of Rank<1000> with template-depth at 256
template struct Rank<250>;
template struct Rank<500>;
template struct Rank<750>;
template struct Rank<1000>;

You may also have an helper:

namespace detail
{

    template <template <std::size_t> class C,
              typename Seq,
              std::size_t BlockSize>
    struct Instantiate_Impl;

    template <template <std::size_t> class C,
              std::size_t... Is,
              std::size_t BlockSize>
    struct Instantiate_Impl<C, std::index_sequence<Is...>, BlockSize>
    {
        std::tuple<C<(Is * BlockSize)>...> dummy;
    };
}

template <template <std::size_t> class C,
          std::size_t N,
          std::size_t BlockSize = 250>
struct Instantiate :
    detail::Instantiate_Impl<C,
                             std::make_index_sequence<1 + N / BlockSize>,
                             BlockSize>
{};

And then

template struct Instantiate<Rank, 2000, 250>; // Rank<2000> is now instantiated.
Jarod42
  • 203,559
  • 14
  • 181
  • 302