2

I am teaching myself c++ by doing a series of exercises. I liked the idea of working out how hash tables could be done using just the language and no std calls. I discovered that you cant do "class member function partial template specialisation" and so worked on options for how this would be achieved. The code below is a stripped down version of the useful parts of a hash table. Leaving just enough to show the structure of what is needed.

Is there a better way to construct the class - IE a better idiom to solve partial specialisation?

I have tried full class specialisation and struct partial template specialisation but this seems the best so far.

I would be interested in comments about other ways to call the members and fuctions. I have thought of:

  1. A 'using' line for all the members/function in the fuctions.
  2. One 'using' line to create a typedef that can be used as the prefix.
  3. Putting the full cast at each use.
#include "stdio.h"

template <typename K, typename H>
class hash_table_common {
public:
    H& hash_type()                          // Standard CRTP helper function
    {
        return static_cast<H&>(*this);
    }
    int& operator[](K key)
    {
        return hash_type().get_value(key);
    }
    size_t hash(int key)
    {
        return key % 10;
    }
    int m_storage[10]{ 0 };
};

template <typename K>
class hash_table : public hash_table_common<K, hash_table<K>> {
public:
    class hash_table_common<K, hash_table<K>>& hash_type()
    {
        return static_cast<hash_table_common<K, hash_table<K>>&>(*this);
    }
    int& get_value(K key)
    {
        int hashable = 3; // value_to_int(); Real code will go here, for this demo it works for one value!
        int index1 = hash_type().hash(hashable);
        return hash_type().m_storage[index1];
    }
};

template <>
class hash_table<const char*> : public hash_table_common<const char*, hash_table<const char*>> {
public:
    class hash_table_common<const char*, hash_table<const char*>>& hash_type()
    {
        return static_cast<hash_table_common<const char*, hash_table<const char*>>&>(*this);
    }
    int& get_value(const char* key)
    {
        int hash_as_int = (int)key[0];
        int index = hash_type().hash(hash_as_int);
        return hash_type().m_storage[index];
    }
};
#endif

int main() {
    class hash_table<const char*> a;
    class hash_table<float> b;
    a["word"] = 3;
    b[4.5f] = 14;
    printf("%d %d", a["word"], b[4.5f]);
    return 0;
}

  • 3
    Unrelated technical note: [What are the rules about using an underscore in a C++ identifier?](https://stackoverflow.com/questions/228783/what-are-the-rules-about-using-an-underscore-in-a-c-identifier) – user4581301 Apr 22 '21 at 21:18
  • 2
    `int& operator=(int value) { return value; };` - You'll return a dangling reference here (and you shouldn't put `;` after function definitions) – Ted Lyngmo Apr 22 '21 at 21:19
  • 5
    imho its a bit too much code. The working code part could make a question for https://codereview.stackexchange.com/ while the non-working part is hidden behind the #ifdef and the error message should be in the question. – 463035818_is_not_an_ai Apr 22 '21 at 21:19
  • 5
    asking for "the most acceptable c++ styled .." is suspecitble to be opinion based. [`std::hash`](https://en.cppreference.com/w/cpp/utility/hash) is "accepted", though I dont nkow if it is "most accepted" ;). I suggest to focus on either the general question in the title, a general question about hash or the specific error in the code, but not all at the same time. Often an answer to a focused question helps to see the bigger picture (+1 anyhow :) – 463035818_is_not_an_ai Apr 22 '21 at 21:30
  • 1
    Why do you need any specialization for this? `std::unordered_map a` handles `a["word"] = 3` without any specialization needed, so you can too. Oh, you want a `map`? Ugh. – Mooing Duck Apr 22 '21 at 21:36
  • Yes I know the underscore is naff. I just wanted to try to minimise the clutter. If not an _ then what would you do for a case like this? – StoneMonkeyMark Apr 22 '21 at 21:37
  • 1
    `base::_` looks like a `base::operator H&` – 463035818_is_not_an_ai Apr 22 '21 at 21:39
  • Yes I worried it was all too much in one post. But I wanted to show I tried to do some homework and to allow the variations to be compared against each other. – StoneMonkeyMark Apr 22 '21 at 21:39
  • 2
    `int hash_as_int = *((int*)&key);` well that's super undefined behavior. What if the Key is only 1 byte? – Mooing Duck Apr 22 '21 at 21:40
  • Yes, Mooing Duck, I did specifically want to handle c-strings and also different methods to save off the values. Non sized items is what pushes it to a specialisation case. – StoneMonkeyMark Apr 22 '21 at 21:42
  • 1
    @StoneMonkeyMark It'll be UB even if `sizeof(key) >= sizeof(int)` – Ted Lyngmo Apr 22 '21 at 21:47
  • 2
    Why not start with this one: "How to get the last variant to compile?". Then you should include a [mcve] and the compiler error. Remove everything from your code here that is not needed to reproduce the error. You may add a [link to your complete code](https://godbolt.org/z/hqEnzMnhz) of course. Consider that looking at several different `base` and `derived` at the same time is confusing. – 463035818_is_not_an_ai Apr 22 '21 at 21:47
  • And the hash is just a toy - I am also working on the real stuff that goes in there. I wanted to trim it to a bare minimum to understand the moving parts and why the complexity in the class functions is there. – StoneMonkeyMark Apr 22 '21 at 21:51
  • I found the _() from here (https://www.fluentcpp.com/2017/08/11/how-to-do-partial-template-specialization-in-c/) which is turn was from Arthur O’Dwyer’s CppCon talk. – StoneMonkeyMark Apr 22 '21 at 22:12
  • @TedLyngmo so you mean it was missing the reference right. IE it should be `int& operator=(int& value) { return value; }` – StoneMonkeyMark Apr 22 '21 at 22:20
  • @StoneMonkeyMark Yes, that wouldn't return a dangling reference. It's a bit odd to have a `base::operator=` that returns an `int&` instead of `base&` though. – Ted Lyngmo Apr 23 '21 at 04:58
  • 1
    @TedLyngmo Thanks, that is old (incorrect) stuff from various testing. Drat I thought I had trimmed it all down to the smallest parts. I will try a new post shortly and cover all the points raised here. Thanks all. – StoneMonkeyMark Apr 23 '21 at 09:01
  • I reposted a simpler version over on codereview. Here is a link to it: https://codereview.stackexchange.com/questions/259908/is-this-a-good-way-to-construct-the-class-for-a-c-template-partial-specialisat – StoneMonkeyMark Apr 23 '21 at 16:40
  • So I was told they dont want this kind of question. They only take fully finished code - not interested in design structure it seems. So I edit it for the suggestions given so far. – StoneMonkeyMark Apr 23 '21 at 19:51
  • Is the specialisation really necessary? How much does the code differ between the different data types? In C++17 you might as well simply use a `constexpr if` in combination with a `std::is_same` but I can have a more detailed look at your code tomorrow... – 2b-t Apr 23 '21 at 23:56
  • @2b-t I know all this is done for me with std, I am treating is as a learning exercise. I tried to read the std implementation code but it is too complex for me just now, and doing is always a better way to learn (grin). I started down the typeid and decltype route, but then added 'get this done at compile time' to my exercise. I will go and see how is_same is implemented and constexpr could work and get it all compile time. That would be a good thing to learn. For this case practising CRTP was good as I can see myself using that in the future for other things. – StoneMonkeyMark Apr 24 '21 at 10:46
  • @2b-t If I dont have the compile time type check then for a general hash taking const char* as well as "compile time size known" types then the backing for storage will be very different. The hash to uint part will likely be different unless the size is stored in the backing storage for the key and data. I also dont know how to reuse the storage of any const supplied keys or values rather than always copying out the data to the backing storage. Hints on that would be cool too. Thanks so much for having a look at this. – StoneMonkeyMark Apr 24 '21 at 10:53

3 Answers3

2

First, lets consider your example of implementing a hashtable. I made some small modifications to your main:

int main() {    
    my_hash_table<const char*> c;
    my_hash_table<float> d;
    c["word"] = 3;
    d[4.5f] = 14;
    std::cout << c["word"] << " " << d[4.5f] << "\n";
}

Lets take this as test case that we want to pass.

Ignoring deficiencies of the implementation and only talking about design, this is what is needed to "pass the test" and have specializations for const char* and float:

#include <map>
#include <iostream>
#include <type_traits>

template <typename T>
struct my_hash {
    size_t operator()(const T& t){ return t;}
};
template <typename key_t,typename hash = my_hash<key_t>,typename value_t = int>
struct my_hash_table {
    std::map<size_t,value_t> table;
    value_t& operator[](key_t key){ return table[hash{}(key)]; }
};
template <>
struct my_hash<const char*> {
    size_t operator()(const char* t){ return t[0]; }
};
template <>
struct my_hash<float> {
    size_t operator()(const float& t) { return t*100; }
};

Don't get me wrong, but your code looks like you tried to squeeze everything into one class. Not everything related to a class must be inside the class' definition. Rather, the less there is in the definition of a class, the better. Functors are typically rather light-weight, they have an operator() (ie they can be called) and often thats it. You can easily add specializations for my_hash.

Also, inheritance has its place, but to be honest, I don't understand what purpose it serves in your code. Last, but not least you are mixing the hashing function with the container, but those are seperate concerns. A different container might use the same hashing function and you might want to use the same data structure with a different hash.

"Class member function partial template specialisation" isn't a thing. Though, that does not pose a problem per se. It's just a tool we do not have in our box. When you want member function of a class template to do something special depending on only one of the paramters, you can do that. Either as above, or C++17 introduced if constexpr :

template <typename A> struct foo { void operator()() {} };
template <> struct foo<int> { void operator()(){ std::cout << "A is int\n";}};

template <typename A,typename B>
struct bar {
    void do_something() {
        
        foo<A>{}();

        if constexpr (std::is_same_v<A,int>){
            std::cout << "A is int\n";
        }
    };
};

Live Example

463035818_is_not_an_ai
  • 109,796
  • 11
  • 89
  • 185
  • 1
    Wow, what can I say. Lots to learn from this. I will take it away and see if I can change my code to match. I tried the struct approach as I liked that but I never got the syntax wrong. I was missing the `typename hash = my_hash` and the `hash{}`. I take it I can call non-operator functions in the same way. Let me try stuff and get back to you. – StoneMonkeyMark Apr 24 '21 at 18:56
  • _Ignoring deficiencies of the implementation and only talking about design_ And thanks for understanding the key part of what I was asking. – StoneMonkeyMark Apr 24 '21 at 18:58
  • I now see would not put several functions in the one struct. So it would be a functor per call. That will start to look a big messy with 4 or 5 fuctions as they all belong in the one specialisation. Especially with the ordering to get the declaration right. – StoneMonkeyMark Apr 24 '21 at 19:05
  • OK so I am understanding it better and I dont think that is really an answer to the original question. Perhaps using a hash map is too distracting. The main goal was to get partial specialisation where both parts can use the functions and members of the other part. Sure you have solved the code but I am still not sure we are addressing the best replacement for the _tool we do not have in our box_. For this case I might be able to keep going and get a full hash map working using this (and I was doing so with my approaches, which I will now try yours) but I was side tracked by the main question. – StoneMonkeyMark Apr 24 '21 at 19:46
  • @StoneMonkeyMark the thing is that there is no exact replacement for the feature you miss (if there would be an *exact* replacement, it would be no replacement, but just the feature). You have to consider the problem at hand and choose a different solution. I admit that already while writing I was afraid that I didnt completely understand the actual problem you are trying to solve – 463035818_is_not_an_ai Apr 24 '21 at 20:06
  • OK I like lots of your solution, but can I post a solution myself as well? I used the struct approach. Which I can now get to compile (thanks so much) but kept all my original structure there. I had to pull in a reference to the class so as to get its functions and members. This complicates it by having the struct declaration at the top. PS. I will now forever use name_t for my typename values. It is so obvious. – StoneMonkeyMark Apr 24 '21 at 20:30
  • @StoneMonkeyMark i dont understand what you mean with "type capabilities". C++ has no reflection if you mean that. Perhaps better open a new question – 463035818_is_not_an_ai Apr 26 '21 at 16:37
  • @StoneMonkeyMark templates is a language feature and templates allow you to query properties of types, for that there are lots of traits in `` – 463035818_is_not_an_ai Apr 26 '21 at 16:39
  • Sorry, yes type reflection in the language itself rather than in the libraries. It dawned on me that the compiler teams are doing all this and not to any low-level language standard, but to a high level API standard. It made me realise C++ can’t be used 'by itself'. I know that sounds odd, but I came from old school C programming. I am starting to see the subtlety that is the seed of the rants from rust programmers now. – StoneMonkeyMark Apr 26 '21 at 17:13
  • And how are you @ naming me and I cant @ name you back. I keep trying. It worked before. Sorry about that. – StoneMonkeyMark Apr 26 '21 at 17:13
  • you cannot put the @ here, because its a comment on my answer. I get a notification anyhow. Back on topic: Sorry I really dont understand what you mean with "C++ cant be used by itself". Specifically the traits in `` are rather straight-forward to implement. Its a matter of convenience to have them in the standard, but not having `` would not remove anything that you can do as a programmer (there might be exceptions that rely on some implementation defined behavior, but I am not aware of any) – 463035818_is_not_an_ai Apr 26 '21 at 17:20
  • I know the comment seems odd to well versed C++ users! I like to see how things are done and I thought a simpler version of is_same could be written by me. So I looked the MS code (and could not get to the bottom of over 500kb of source code that was included). But while doing that I realised it is not in the language definition it is the std library code. I suppose that is the same as C language does not allow you access to the stack. So you have to consider stdlib to be part of it. I am doing all this nutty approach as I would like to (one day) be able to read the code in the libs – StoneMonkeyMark Apr 26 '21 at 18:04
  • So I read this https://stackoverflow.com/questions/36416773/how-does-rust-implement-reflection and rust sounds just like what I saw in the C++ libs. Same hash as the id as well. I am also doing this not just to learn C++ but to also get a better understanding of all languages. I am also looking at, but not programming in Rust, Zig and Jai. This has been a really useful journey and you have been very patient. Thanks so much. – StoneMonkeyMark Apr 26 '21 at 18:14
  • @StoneMonkeyMark cppfref usually has possible implementations. Also for [`std::is_same`](https://en.cppreference.com/w/cpp/types/is_same). Even without relying on other standard library facilities (they use `std::true/false_type`), its just a couple of lines https://godbolt.org/z/dP8MMr7az. I have zero experience with looking at implementations of standard library, but I can imagine that it isnt easy to read – 463035818_is_not_an_ai Apr 26 '21 at 18:17
  • Yes I have seen the output post compile and that is expected. I dont know how to see the intermediate stages (if at all possible) to see the code that should look like a comparision between two hash values and then that turns into a constant when it does the 'final' compile. I will have a look at cppreference.com for some code. Ta – StoneMonkeyMark Apr 26 '21 at 18:43
  • @StoneMonkeyMark maybe thats your misunderstanding: `std::is_same` is not doing anything at runtime, its all compile time. There is also no comparison of hashes, it uses the normal rules of template specialization to pick the false specialization for different types and the true one for the same types. – 463035818_is_not_an_ai Apr 26 '21 at 18:53
  • I agree but the compiler creates hash values for all the types and uses them during the compile. This is can then be tapped into. But yes I just saw the obvious and created is_same just using the template approach. – StoneMonkeyMark Apr 26 '21 at 19:41
  • I got this to work but its not quite the same. I want the struct to evaluate to false?! `template struct is_same { constexpr operator bool () { return false; }; }; template struct is_same { constexpr operator bool () { return true; }; }; ` and this `if constexpr (is_same())`. – StoneMonkeyMark Apr 26 '21 at 19:42
  • @StoneMonkeyMark `is_same()` creates an object, you still need to call the operator. Well yes, dont get me wrong, but this is too much for comments – 463035818_is_not_an_ai Apr 26 '21 at 19:44
  • Yes you are going to hate this one (lol). But the code works by return the bool from the function. See below for it in full. It does look nicer than the other solutions. – StoneMonkeyMark Apr 26 '21 at 19:51
0

Using the struct approach from @largest this is another solution candidate for the best idiom for this use case. I think I still like the CRTP option due to its symetric way of calling the other parts functions and members. hash_type()

In the code below there must be better names for hash_table_functions amd table as these should be invisible redirections I want say _{} and _ but the underscore police will come after me (lol).

#include <stdio.h>
#include <stdint.h>

template <typename key_t>
struct hash_table_struct;

template <typename key_t, typename hash_table_functions = hash_table_struct<key_t>>
class hash_table {
public:
    int& operator[](key_t key)
    {
        return hash_table_functions{}.get_value(*this, key);
    };
    size_t hash(uint32_t key)
    {
        return key % 10;
    }
    int m_storage[10];
};

template <typename key_t>
struct hash_table_struct {
    int& get_value(hash_table<key_t>& table, key_t key)
    {
        uint32_t hashable = (uint32_t)key;
        size_t index = table.hash(hashable);
        return table.m_storage[index];
    }
};

template <>
struct hash_table_struct<const char*> {
    int& get_value(hash_table<const char*>& table, const char* key)
    {
        uint32_t hashable = (uint32_t)key[0];
        size_t index = table.hash(hashable);
        return table.m_storage[index];
    }
};

#endif

int main() {
    class hash_table<const char*> a;
    class hash_table<float> b;
    a["word"] = 3;
    b[4.5f] = 14;
    printf("%d %d", a["word"], b[4.5f]);
    return 0;
}

  • as you are looking for idioms, you might be interested in [policy based design](https://en.wikipedia.org/wiki/Modern_C%2B%2B_Design). The "modern" prefix is a bit worn out in the context of C++, but even after 20 years the books is worth a read – 463035818_is_not_an_ai Apr 24 '21 at 22:00
  • actually it looks like the example on the wiki page is closer to what you were looking for than my answer, though I also would have to read more to understand it in details. – 463035818_is_not_an_ai Apr 24 '21 at 22:07
0

The constexpr version Which only works on C++17. Emululated is_same_v to keep to the original rules. This was useful as it taught me how that all works. This results in much cleaner and easier to read code.

// C++ 17 required
#include <stdio.h>
#include <stdint.h>

template<class type1, class type2>
struct is_same
{
    static constexpr bool _value = false;
    constexpr operator bool() const noexcept { return _value; }
};

template<class type1>
struct is_same<type1, type1>
{
    static constexpr bool _value = true;
    constexpr operator bool() const noexcept { return _value; }
};

template<class type1, class type2>
inline constexpr bool is_same_v = is_same<type1, type2>::_value;

template <typename key_t>
class hash_table {
public:
    int& operator[](key_t key)
    {
        return get_value(key);
    };
    size_t hash(int key)
    {
        return key % 10;
    }
    int m_storage[10]{ 0 };
    int& get_value(key_t key)
    {
        uint32_t hashable;
        if constexpr ( is_same_v<key_t, const char*> )
        {
            hashable = (uint32_t)key[0];
        }
        else
        {
            hashable = (uint32_t)key;
        }
        size_t index = hash(hashable);
        return m_storage[index];
    }
};

int main()
{
    class hash_table<const char*> a;
    class hash_table<float> b;
    a["word"] = 3;
    b[4.5f] = 14;
    printf("%d %d", a["word"], b[4.5f]);
    return 0;
}