I don't know if this is a proper way to answer Qs on here. I wanted to post this in a comment but I lack the rep.
But how about letting the profiler and your competition tell you how good or bad your solution is? If your top hotspot shows a boatload of cache misses, the profiler has basically told you where to improve for a second draft. If you're competing against a search engine, measure how fast it can look up results and compare yours. If yours is worse, maybe you need to research newer techniques or look more closely at your hotspots. If it's better, and especially far better, you might have come across a breakthrough in your industry.
For load factors/table sizing, I don't think a load factor of 1 is even that large if an empty node only costs 32 bits. If you instead use 64 bits per node for the convenience of pointers on 64-bit architectures, that's considerably more wasteful, and you'll typically double your cache misses when you're only storing small keys/values, like a single 32-bit integer in a hash set.
Prime number table sizes are a huge win in all my measurements. I come from image processing, so I arrived with the bias that integer div/modulo is expensive (in image processing we often use power-of-two sizes so we can replace division and modulo with bitshifts and bitwise AND when accessing pixel rows), but that was totally wrong, and my profiler taught me how wrong I was faster than anyone else could have.
For hash algorithms, I've tried all sorts: DJB2, MurmurHash, CityHash, ones that use 64-bit hashes and SIMD, etc. I never have cryptographic needs; I just want the most efficient associative containers. DJB2 was one of the best I've found for my use cases and is ridiculously simple to implement, but in my experience table size usually matters far more than minutely-better hashing algorithms, for linear probing, separate chaining, and all other variants.
One thing that seems obvious to me, but I'll point it out just in case: if your hash table compares keys a lot, or compares hashes for equality before comparing keys, you will probably benefit from storing keys, hashes, and/or values in separate memory blocks (e.g. parallel arrays, an SoA representation). It's 101-level hot/cold field splitting, but I've been disappointed to find so many implementations disregarding it, storing keys, values, and possibly hashes all interleaved in an AoS representation and loading mostly-irrelevant data into cache lines left and right.
Also, people who think separate chaining is slower than linear probing are probably thinking with a very serial mindset, and of the crude implementations that use 64-bit pointers on 64-bit architectures and allocate bucket nodes one at a time with a general-purpose, variable-length allocator. Separate chaining is way underrated, especially if your hash table handles parallel requests for insertions/deletions/searches. It's clear even from the name, "probing": probing wants to read and write the same shared data across threads, so you will generally end up with far more false sharing with a probing solution in those cases, even when carefully using techniques like TTAS, and even though probing might perform best in single-threaded use. Computer science courses are getting really backwards these days; we have 4+ core machines even in $300 mini PCs, yet they still teach as if we had one physical thread. I implemented a chained table the other day, using DJB2 for the hash and prime table sizes, that can insert 1024*1024 elements into a map of int32_t keys and values in under 6 ms on my i7-2600K. That's faster than std::vector<int> in C++, which takes 15 ms to insert 1024*1024 int32_ts, in spite of the map storing twice as much data (both keys and values) and doing a bucket search on each insertion to avoid inserting duplicates.
So I hope that wasn't an entirely useless answer; I was going with the vibes. I think you lack confidence, and measuring more, both with tools like profilers and against whatever you're competing with, will let you know where your solutions stand in the industry.