How to add a bool into a struct without increasing sizeof (in case that there is padding in the struct)?

Question

I'm creating a hash table. A node of the hash table stores a KEY, a VALUE, and a flag (used or not):

template <typename KEY, typename VALUE>
struct Node {
    union { KEY key; };
    union { VALUE value; };
    bool used;

    Node() { }
};

The key and value is in a union, because they are created only when the Node is actually in use (this is important).

Now, suppose, that KEY or VALUE has some padding in it. It is a sensible need then to put used into the padding area (so Node can be smaller).

Is it possible to do this (automatically)?

If it is not possible generally, is it possible to do this, if KEY or VALUE has tail-padding?

Note, I've an idea how utilize tail-padding, however it has undefined behavior. Basically, the idea is to derive from KEY or VALUE, and put bool used there (in this case, current compilers put bool used into the tail, if the object has non-standard-layout). But unfortunately, used cannot be used until the KEY or VALUE actually created (new'd) - that's a problem, because when an empty Node is created, neither the KEY or VALUE is created (it's an open addressing hash table, empty nodes exist).

Note2: I'm using this approach only when there is no special empty KEY or VALUE value. If KEY or VALUE has a special empty value, then of course, I don't need to use a separate bool used.

I don't think there is a safe way to do this that will work for any KEY and VALUE type. As a simple alternative, how about allocating a separate packed-bits array (or `std::bitset` if you prefer) and storing all your boolean/used-values there? That would only increase your Hashtable's memory usage by approximately 1 bit per Node-object. — Jeremy Friesner, Nov 08 '18 at 22:05
@JeremyFriesner: Thanks for the suggestion. The problem with the separate bits-array approach is that it's allocated somewhere else, so for large hash tables, it means an extra cache-miss. A possible workaround (for small KEY/VALUE elements) is to put 8/16/32 bit flags interleaved into the table. However, it makes the implementation complicated, and slower. — geza, Nov 08 '18 at 22:11

score 1 · Answer 1 · answered Nov 08 '18 at 19:17

So I thought about this a little more and realized that, although you can't use the padding inside the stored objects, you can try to arrange Node's members so they take up the least amount of space. Of course you could hard-code that, but I thought a more elegant way would be to automate it.

To find the smallest arrangement we need to generate all possible arrangements and pick the smallest one. Generating all arrangements can be accomplished with metafunction permutations, which generates all permutations of a pack of types:

template <typename P1, typename P2>
struct merge {};

template <template <typename...> class P, typename... Ts, typename... Us>
struct merge<P<Ts...>, P<Us...>> {
    using type = P<Ts..., Us...>;
};

template <typename T, typename P>
struct prepend {};

template <typename T, template <typename...> class P, typename... Packs>
struct prepend<T, P<Packs...>> {
    using type = P<typename merge<P<T>, Packs>::type...>;
};

// N is the number of rotations to go
template <std::size_t N, typename Pack, typename = void>
struct permutations_impl {};

template <template <typename...> class P, typename... Ts>
struct permutations_impl<0, P<Ts...>> {
    // All rotations done, break the recursion
    using type = P<>;
};

template <std::size_t N, template <typename...> class P, typename T>
struct permutations_impl<N, P<T>> {
    using type = P<P<T>>;
};

template <std::size_t N, template <typename...> class P, typename F, typename... Rest>
struct permutations_impl<N, P<F, Rest...>, std::enable_if_t<(sizeof...(Rest) && N != 0)>> {
    using PermuteRest = typename permutations_impl<sizeof...(Rest), P<Rest...>>::type;
    using NextRotation = typename permutations_impl<N-1, P<Rest..., F>>::type;

    using type = typename merge<typename prepend<F, PermuteRest>::type, NextRotation>::type;
};

template <typename Pack>
struct permutations {};

template <template <typename...> class P, typename... Ts>
struct permutations<P<Ts...>> {
    using type = typename permutations_impl<sizeof...(Ts), P<Ts...>>::type;
};

To pick the smallest of all permutations we can then do something like this:

template <typename Pack>
struct min_size_element {
    static constexpr std::size_t min_index(std::initializer_list<std::size_t> l) {
        std::size_t min = *l.begin();
        std::size_t idx = 0, result = 0;
        for(auto it = l.begin(); it != l.end(); ++it, ++idx) {
            if(*it < min) min = *it, result = idx;
        }
        return result;
    }
};

template <typename... Ts>
struct min_size_element<std::tuple<Ts...>> : min_size_element<void> {
    using type = std::tuple_element_t<min_index({sizeof(Ts)...}), std::tuple<Ts...>>;
};

template <typename... Ts>
struct smallest_tuple {
    using tuples = typename permutations<std::tuple<Ts...>>::type;
    using type = typename min_size_element<tuples>::type;
};

The Node class would then store the key, value, and used flag in a tuple chosen by smallest_tuple. The elements need to be accessed by type (because we don't know their index), so key and value elements need to be wrapped in unique wrapper types. An implementation of the Node class with wrappers and accessors for key and value could look like this:

template <typename K, typename V>
class Node {
    union KeyWrap {
        struct{} _;
        K key;

        constexpr KeyWrap() : _() {}
    };
    union ValueWrap {
        struct{} _;
        V value;

        constexpr ValueWrap() : _() {}
    };
    // Need to wrap key and value types because tuple elements are accessed by type
    using Storage = typename detail::smallest_tuple<bool, KeyWrap, ValueWrap>::type;

    Storage mStorage;

public:
    constexpr Node() = default;

    ~Node() {
        if(this->used()) {
            this->key().~K();
            this->value().~V();
        }
    }

    Node& operator=(std::pair<K, V> entry) {
        if(this->used()) {
            this->key() = std::move(entry.first);
            this->value() = std::move(entry.second);
        } else {
            new(&std::get<KeyWrap>(mStorage).key) K(std::move(entry.first));
            new(&std::get<ValueWrap>(mStorage).key) V(std::move(entry.second));
            std::get<bool>(mStorage) = true;
        }
        return *this;
    }

    bool used() const {
        return std::get<bool>(mStorage);
    }

    K& key() {
        assert(this->used());
        return std::get<KeyWrap>(mStorage).key;
    }

    V& value() {
        assert(this->used());
        return std::get<ValueWrap>(mStorage).value;
    }
};

Live demo

I also tried to automatically detect whether a type has some padding, but wasn't able to do it reliably, and only for types which can be derived from. Have a look at my answer here.

Thanks for the answer. For this particular problem, I think that if I just swap the key/value order if key has smaller alignment provides the same result. — geza, Nov 08 '18 at 21:34
I thought I had one when I wrote that comment, but now I'm thinking you're right. You need to pack the two members with the smallest alignment together, either it's better if there's space for the bool or it doesn't make a difference. — Pezo, Nov 08 '18 at 22:13

How to add a bool into a struct without increasing sizeof (in case that there is padding in the struct)?

1 Answers1