C++ O(1) lookup by derived type

Question

I have a vector of pointers to Base.

Invariant: only one of each derived type should be in that vector at any time.

I also want to be able to lookup the value with a given type in O(1). I can do this in O(n) easily, by checking dynamic_cast.

Basically, I want to replace my vector with a map or something. Is that possible?

Here's minimal example with the vector and the loop:

#include <functional>
#include <iostream>
#include <memory>
#include <type_traits>
#include <vector>

using namespace std;

typedef struct Base {
  virtual ~Base(){};
} Base;

vector<unique_ptr<Base>> baseList;

template <typename NarrowType,
          typename std::enable_if_t<
              ! std::is_same_v<Base, NarrowType> &&
                  std::is_base_of_v<Base, NarrowType>,
              bool> = true>
void ApplyFuncToType(function<void(NarrowType)> func) {
  // Want to get rid of this loop
  for (auto &base : baseList) {
    NarrowType *narrow = dynamic_cast<NarrowType *>(base.get());
    if (narrow) {
      func(*narrow);
    }
  }
}

// usage

int main() {
  typedef struct A : Base {
    void printA() { cout << "a" << endl; }
  } A;
  typedef struct B : Base {
    void printB() { cout << "b" << endl; }
  } B;

  baseList.push_back(make_unique<A>());
  baseList.push_back(make_unique<B>());

  ApplyFuncToType<A>([](A a) { a.printA(); });
}

Questions:

How can I enfore my invariant (one of each type max in container)
Would a unordered_map<type_info, unique_ptr<Base>> be a good solution to this? I have read that typeid is not consistent or safe to use or something, but am not sure exactly.

Edits/Clarification:

This is for a system where other classes can register their own types in this vector. i.e. the contents of the vector will change during runtime.

A similar approach is shown here, where an unordered_map is used to allow self-registered event callbacks.

If the O(1) average lookup time is important, I'd probably just amend your data structure to track at insertion time when a derived object is added, by index. Then you get the added benefit of being able to verify that no more than 1 such object is in there, as well, as being able to update the invariants in the future if they change. — Tumbleweed53, Dec 06 '20 at 02:14
Any `map` or `unordered_map` will be a hell slower than just using index as you exactly wish (O(1)): `std::vector>`, `enum class NarrowType { A, B, ... }.` What is actually against this solution? — bloody, Dec 06 '20 at 02:41
Then also `baseList[NarrowType::A]->printA();` does the same thing. No need for that whole template function. :D At least for presented use case. — bloody, Dec 06 '20 at 02:51
It is not guaranteed that `typeid` will give the same `std::type_info` object when used on two objects of the same type. However, those objects will be associated with the same `std::type_info::hash_code` and `std:type_info::typeindex` values. The catch with all that is that the `hash_code` and `typeindex` values aren't guaranteed to be unique for distinct types. so using `typeid` and things related to it isn't guaranteed to be unique. — Peter, Dec 06 '20 at 02:55
I'm doubtful that the O(1) requirement is even achievable, even if we can ASSUME the invariant has been enforced. Even a map gives O(log(N)) complexity for finding an element, where `N` is number of elements in the container. — Peter, Dec 06 '20 at 03:00
I'm questioning the goal itself: why would you need a O(1) solution when n is known at compile-time (number of derived types) and probably not that big? — YSC, Dec 06 '20 at 03:29
**`typedef struct Base {...} Base;`** - uh,you do know that you're writing C++, right? None of this C chatter belongs in C++ :) You want `struct Base {...};`. — Kuba hasn't forgotten Monica, Dec 06 '20 at 04:16
@Peter That a hash code is not guaranteed to be unique is obvious (due to the nature of hash codes). However, that you claim this for `std::type_index` as well makes me uncertain. I always was under impression that this is the exact intention of a `std::type_index` (to provide a unique index for a type). I recalled [std::type_index](https://en.cppreference.com/w/cpp/types/type_index) but couldn't find any note or hint for this. Can you provide a reference for your claim? — Scheff's Cat, Dec 06 '20 at 06:20
(Ignoring the compile-time requirement) I would go with `std::unordered_map>`. (Actually, I already did in a small case study something very similar: `using AppDataMap = std::map;`. I used `std::map` because, in my case, less storage was more important than faster access as I will have many instances of this - with probably only a few entries for each instance.) — Scheff's Cat, Dec 06 '20 at 06:27
@Scheff `std::unordered_map` doesn't make much sense for slowly changing data. A sorted `std::vector` would perform much better, I think, unless you had like thousands of values to store and were bleeding from cache misses for binary searches. I imagine that the application here calls for a couple dozen values at most, hundred if we're lucky - nothing beats `std::vector` for that. But in any case, all this is totally moot since these types are static and known - and that's what a `std::tuple` is for. — Kuba hasn't forgotten Monica, Dec 06 '20 at 06:36
@UnslanderMonica Your comment made me thinking about my case study I mentioned. A sorted `vector` of pairs might be in fact more appropriate than a `map`... :-) — Scheff's Cat, Dec 06 '20 at 06:50
Thanks for the comments all. The points about not really needing O(1) are well taken. — MHebes, Dec 06 '20 at 07:48
It's worth mentioning: don't use `dynamic_cast` for type testing or dynamic dispatch. Use virtual methods if you have a fixed set of functions and an unknown set of types. Read up on the visitor pattern if you a fixed set of types and an unknown set of functions to apply. And if you've got unknown number of both, you might have a design issue and should add some context to your question to see we can avoid that situation. — parktomatomi, Dec 06 '20 at 08:23

Kuba hasn't forgotten Monica · Accepted Answer · 2020-12-06T06:48:29.613

Yeah, sure, it's possible, but I'm not convinced you need it. After all, all your types are completely static.

Also, ApplyFuncToType shouldn't be taking std::function, but a generic argument, since you'll save on the cost of shoehorning things into std::function. You're not deducing any types anyway - because std::function is not a tool for that - and thus you have the call that includes the type parameter explicitly: ApplyFuncToType<A>.

And finally, it's probably wrong to pass A and B to the lambda by value - since then the instance the lambda is using is not the instance you so carefully deposited beforehand (!). It should be passed by const reference, or by reference if it's a non-const method:

// Do this
ApplyFuncToType<A>([](const A &a) { a.printA(); });
// Or do that
ApplyFuncToType<A>([](A &a) { a.printA(); });
// NO!
ApplyFuncToType<A>([](A a) { a.printA(); });

It's hard to deduce it ahead of time, but I imagine that you'd want to make A, B, ... non-copyable but they definitely should be movable (read on).

A Tuple of Pointers

All you really want is the below - and it doesn't care that the types are derived from some base, you can use any types you wish. You can of course add type constraints if you want to protect from bugs where wrong types are supplied to ptr_tuple.

#include <functional>
#include <memory>
#include <tuple>

struct A { void methodA() {} };
struct B { void methodB() {} };

template <class ...Args>
using ptr_tuple = std::tuple<std::unique_ptr<Args>...>;

ptr_tuple<A, B> instances;

template <typename T>
auto &instance()
{
    return std::get<std::unique_ptr<T>>(instances);
}

template <class T, class Fun, class ...Args>
void invoke(Fun &&fun, Args &&...args)
{
    auto *ptr = instance<T>().get();
    if (ptr) {
        std::invoke(fun, *ptr, std::forward<Args>(args)...);
    }
}

int main() {
    instance<A>() = std::make_unique<A>();
    instance<B>() = std::make_unique<B>();

    invoke<A>([](A& a){ a.methodA(); });
    invoke<B>([](B& b){ b.methodB(); });
}

Argument Deduction for Invoke/Apply

It's not even necessary to supply the explicit type parameter to invoke. We can deduce it. For that, we use a traits class that's sorely missing in C++ standard library:

// from https://stackoverflow.com/a/39717241/1329652
// see also
// https://github.com/kennytm/utils/blob/master/traits.hpp
// https://stackoverflow.com/a/27885283/1329652
// boost::callable_traits

template <typename T, typename = void>
struct function_traits;

template <typename R, typename... A>
struct function_traits<R (*)(A...)>
{
    using args_type = std::tuple<A... >;
    using arg0_class = std::decay_t<std::tuple_element_t<0, args_type>>;
};

template <typename R, typename C, typename... A>
struct function_traits<R (C::*)(A...)>
{
    using args_type = std::tuple<A... >;
    using arg0_class = std::decay_t<std::tuple_element_t<0, args_type>>;
};

template <typename R, typename C, typename... A>
struct function_traits<R (C::*)(A...) const>
{
    using args_type = std::tuple<A... >;
    using arg0_class = std::decay_t<std::tuple_element_t<0, args_type>>;
};

template <typename T>
struct   function_traits<T, std::void_t<decltype(&T::operator())> > 
: public function_traits<               decltype(&T::operator())  >
{};

And then we can deduce the needed type in invoke:

template <class Fun, class ...Args>
void invoke(Fun &&fun, Args &&...args)
{
    using arg0_class = typename function_traits<std::decay_t<Fun>>::arg0_class;
    auto *ptr = instance<arg0_class>().get();
    if (ptr) {
        std::invoke(fun, *ptr, std::forward<Args>(args)...);
    }
}

int main() {
    instance<A>() = std::make_unique<A>();
    instance<B>() = std::make_unique<B>();

    invoke([](A& a){ a.methodA(); });
    invoke([](B& b){ b.methodB(); });
}

A Tuple of Optional Values

Depending on what your A and B types really are, if they can be moved, then using dynamic memory allocation is totally unnecessary, you'd much rather keep them by value, e.g. with optional:

#include <functional>
#include <memory>
#include <optional>
#include <tuple>

struct A { void methodA() {} };
struct B { void methodB() {} };

template <class ...Args>
using opt_tuple = std::tuple<std::optional<Args>...>;

opt_tuple<A, B> instances;

template <typename T> auto &instance()
{
    return std::get<std::optional<T>>(instances);
}

template <class T, class Fun, class ...Args>
void invoke(Fun &&fun, Args &&...args)
{
    auto &opt = instance<T>();
    if (opt) {
        std::invoke(fun, *opt, std::forward<Args>(args)...);
    }
}

int main() {
    instance<A>().emplace(); // constructs A
    instance<B>().emplace(); // constructs B

    invoke<A>([](A& a){ a.methodA(); });
    invoke<B>([](B& b){ b.methodB(); });
}

Of course you can add the type-deduced variant of invoke just as before.

A type-id Stand In

Even though I really think that your original solution is in want of a problem - you should state what problem you're trying to solve, otherwise it smells of an XY problem - there of course is a better "type id" than type_id: an address of a function templated on a type. There'll be only one instance of it per program.

I don't think that the "O(1)" lookup is a real requirement, a very, very fast O(log(N)) lookup - way faster than you'd get from e.g. std::map, will work just as well for whatever your imaginary applications is.

Thus:

#include <cassert>
#include <functional>
#include <iostream>
#include <memory>
#include <stdexcept>
#include <type_traits>
#include <vector>

// here goes function_traits implementation from above

struct Base {};

template <typename T>
constexpr static bool is_derived_from_Base_v =
    !std::is_same_v<Base, T> && std::is_base_of_v<Base, T>;

class UniqueTypeObjects {
    using marker_type = void(*)();
    struct Pair {
        std::unique_ptr<Base> base;
        marker_type marker;
        Pair(std::unique_ptr<Base> &&base, marker_type marker) : base(std::move(base)), marker(marker) {}
        bool operator<(marker_type o) const { return marker < o; }
    };
    friend bool operator<(marker_type a, const Pair &o);
    template <typename T, typename = std::enable_if<is_derived_from_Base_v<T>>>
    struct Witness {
        static void marker() {}
    };

    std::vector<Pair> m_objects;

public:
    template <class Derived, class =
              std::enable_if_t<is_derived_from_Base_v<Derived>>>
    void insert(std::unique_ptr<Derived> &&obj) {
        auto constexpr marker = &Witness<Derived>::marker;
        auto it = std::lower_bound(m_objects.begin(), m_objects.end(), marker);
        if (it != m_objects.end() && it->marker == marker)
            throw std::logic_error("Attempting to insert an object of duplicate type");
        m_objects.emplace(it, std::move(obj), marker);
    }
    template <typename Derived, typename Fun,
              class = std::enable_if_t<is_derived_from_Base_v<Derived>>>
    void apply(Fun fun) const {
        auto constexpr marker = &Witness<Derived>::marker;
        auto it = std::lower_bound(m_objects.begin(), m_objects.end(), marker);
        if (it == m_objects.end() || it->marker != marker)
            throw std::runtime_error("No object found to apply the function to");
        std::invoke(fun, *static_cast<Derived*>(it->base.get()));
    }
    template <typename Fun,
              class = std::enable_if_t<is_derived_from_Base_v<
                typename function_traits<std::decay_t<Fun>>::arg0_class>>>
    void apply(Fun fun) const {
        using arg0_class = typename function_traits<std::decay_t<Fun>>::arg0_class;
        apply<arg0_class>(std::move(fun));
    }
};

bool operator<(void(*a)(), const UniqueTypeObjects::Pair &o)
{ return a < o.marker; }

char lastInvoked;

int main() {
    struct A : Base {
        void methodA() { lastInvoked = 'A'; }
    };
    struct B : Base {
        void methodB() { lastInvoked = 'B'; }
    };

    UniqueTypeObjects uto;
    uto.insert(std::make_unique<A>());
    uto.insert(std::make_unique<B>());

    assert(!lastInvoked);
    uto.apply([](A &a){ a.methodA(); });
    assert(lastInvoked == 'A');
    uto.apply([](B &b){ b.methodB(); });
    assert(lastInvoked == 'B');
}

But I still don't think it's necessary. If you truly have O(1) requirement, e.g. some sort of a realtime system, or system with deterministic execution timing, then the opt_tuple solution or its equivalent is the one you should use. Otherwise - good luck with the paperwork and test plans to ensure that UniqueTypeObjects works. I wrote the thing and even I wouldn't allow it in a realtime or hi-reliability codebase I maintained. Nothing beats static type safety and ensuring correctness by design, and you get that with the tuple approach (or its equivalent with a custom class).

Thank you for this great response. My application (I'm just exploring) would let user classes register their own structs into this vector. You can see a similar approach [here](https://stackoverflow.com/a/37905169/3554391) for distributed event creation and registration (they use an `unordered_map`). I don't think the tuple-of-types method is possible in this context, since the types aren't known beforehand (sorry for the XY problem!). Regardless, this is an awesome resource for me to learn from. Your points about not needing O(1) in the first place are well taken. Thanks again. — MHebes, Dec 06 '20 at 08:14
@MHebes "the types aren't known beforehand" Of course they are known beforehand, it's just a matter of establishing a way for the user to pass this type information inwards. I presume that you are building a library maybe. Letting the user instantiate something that collects those types would be in the API then. Also note that systems that need that sort of thing (Qt, wxWidgets, etc) just sidestep the issue and do their own type registration that gets rid of the indeterminacy of C++'s mechanism, i.e. they have their own TypeID. — Kuba hasn't forgotten Monica, Dec 06 '20 at 08:22

C++ O(1) lookup by derived type

1 Answers1

A Tuple of Pointers

Argument Deduction for Invoke/Apply

A Tuple of Optional Values

A type-id Stand In