In my chess engine codebase I use a very large hash table; its size can be up to 128 GB. The hash table is a big array of buckets, each holding 4 entries. The table is managed with an STL std::vector. I'm happy with the performance of the code, but I have a problem when initializing the structure.
The structure of the hash table is the following:
class ttEntry
{
private:
signed int key:32; /*! 32 bit for the upper part of the key*/
signed int packedMove:16; /*! 16 bit for the move*/
signed int depth:16; /*! 16 bit for depth*/
signed int value:23; /*! 23 bit for the value*/
signed int generation:8; /*! 8 bit for the generation id*/
signed int staticValue:23; /*! 23 bit for the static evaluation (eval())*/
signed int type:3; /*! 3 bit for the type of the entry*/
/* 121 bits used, stored in 16 bytes*/
public:
explicit ttEntry(unsigned int _Key, Score _Value, unsigned char _Type, signed short int _Depth, unsigned short _Move, Score _StaticValue, unsigned char _gen): key(_Key), packedMove(_Move), depth(_Depth), value(_Value), generation(_gen), staticValue(_StaticValue), type(_Type){}
explicit ttEntry(){}
...
...
};
using ttCluster = std::array<ttEntry, 4>;
class transpositionTable
{
private:
std::vector<ttCluster> _table;
....
....
};
My code for allocating the space is the following:
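As a quick check of the entry size, here is a minimal sketch using a hypothetical stand-in (`EntrySketch`, not the real `ttEntry`) with the same bit-field widths. Bit-field layout is implementation-defined, so the 16-byte result is an assumption that holds on common x86-64 compilers rather than a guarantee.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for ttEntry: same bit-field widths, no constructors.
struct EntrySketch {
    signed int key : 32;
    signed int packedMove : 16;
    signed int depth : 16;
    signed int value : 23;
    signed int generation : 8;
    signed int staticValue : 23;
    signed int type : 3;
};

// Sum of the declared bit-field widths.
constexpr int kDeclaredBits = 32 + 16 + 16 + 23 + 8 + 23 + 3;
static_assert(kDeclaredBits == 121, "declared widths add up to 121 bits");

constexpr std::size_t entryBytes() { return sizeof(EntrySketch); }
constexpr std::size_t clusterBytes() { return sizeof(std::array<EntrySketch, 4>); }

// Runtime check of the assumed layout (implementation-defined, not guaranteed).
inline void checkLayout() {
    assert(entryBytes() == 16);    // four 32-bit allocation units
    assert(clusterBytes() == 64);  // 4 entries per cluster
}
```

Calling `checkLayout()` once at startup would catch a compiler that packs the fields differently.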
uint64_t transpositionTable::setSize(unsigned long int mbSize)
{
uint64_t size = (uint64_t)((((uint64_t)mbSize) << 20) / sizeof(ttCluster));
_elements = size;
_table.clear();
_table.shrink_to_fit();
try
{
_table.reserve(_elements);
_table.resize(_elements); // big bottleneck
}
catch(...)
{
std::cerr << "Failed to allocate " << mbSize<< "MB for transposition table." << std::endl;
exit(EXIT_FAILURE);
}
return _elements * 4;
}
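To make the arithmetic in setSize concrete, here is a small sketch of the same megabytes-to-clusters conversion. It assumes a 64-byte cluster (4 entries of 16 bytes); `kClusterBytes` and `clustersForMegabytes` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one cluster holds 4 entries of 16 bytes each.
constexpr std::uint64_t kClusterBytes = 64;

// Same arithmetic as setSize(): megabytes -> bytes -> number of clusters.
constexpr std::uint64_t clustersForMegabytes(std::uint64_t mbSize) {
    return (mbSize << 20) / kClusterBytes;
}

// Runtime check of two sample sizes.
inline void checkSizes() {
    assert(clustersForMegabytes(1) == 16384);                  // 2^20 / 64
    assert(clustersForMegabytes(128 * 1024) == (1ULL << 31));  // 128 GB
}
```

Under that assumption a 128 GB table is 2^31 clusters, that is 2^33 entries, and with the original code every one of them is touched during resize.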
Initializing the table with 128 GB of RAM takes 108 seconds. I don't need the memory initialized to a known value; I only want to allocate the space and have a std::vector of the right length.
I know I could rewrite this in good old C with malloc, but I'd like to keep working with a modern std::vector.
Any idea how to speed up the code, and what I'm doing wrong?
Following the hints from @MarcGlisse and @Bob__, I modified my code to:
//
//https://stackoverflow.com/questions/21028299/is-this-behavior-of-vectorresizesize-type-n-under-c11-and-boost-container/21028912#21028912
//
// Allocator adaptor that interposes construct() calls to
// convert value initialization into default initialization.
template <typename T, typename A=std::allocator<T>>
class default_init_allocator : public A {
typedef std::allocator_traits<A> a_t;
public:
template <typename U> struct rebind {
using other =
default_init_allocator<
U, typename a_t::template rebind_alloc<U>
>;
};
using A::A;
template <typename U>
void construct(U* ptr)
noexcept(std::is_nothrow_default_constructible<U>::value) {
::new(static_cast<void*>(ptr)) U;
}
template <typename U, typename...Args>
void construct(U* ptr, Args&&... args) {
a_t::construct(static_cast<A&>(*this),
ptr, std::forward<Args>(args)...);
}
};
class transpositionTable
{
private:
std::vector<ttCluster, default_init_allocator<ttCluster>> _table;
...
...
Now the resize is faster (less than 1 second for 8 GB) and all the elements of the table are 0-filled after a resize.
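For anyone who wants to try this in isolation, here is a self-contained sketch of the adaptor together with a vector that uses it. `Entry`, `Cluster`, `Table` and `makeTable` are hypothetical stand-ins, not the engine's real types. With the default allocator, resize() value-initializes each new element; with the adaptor it runs `::new (ptr) U;`, which is default-initialization and writes nothing for a trivial type. The zeros seen afterwards most likely come from the operating system handing out fresh zeroed pages; C++ itself does not guarantee them, so this sketch only reads values it has written.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Allocator adaptor: turns value-initialization into default-initialization.
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
    using a_t = std::allocator_traits<A>;
public:
    template <typename U>
    struct rebind {
        using other =
            default_init_allocator<U, typename a_t::template rebind_alloc<U>>;
    };
    using A::A;

    // Called by resize(n): default-initialize, so trivial types are left as-is.
    template <typename U>
    void construct(U* ptr) noexcept(std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(ptr)) U;
    }
    // Every other construction is forwarded to the wrapped allocator.
    template <typename U, typename... Args>
    void construct(U* ptr, Args&&... args) {
        a_t::construct(static_cast<A&>(*this), ptr, std::forward<Args>(args)...);
    }
};

// Hypothetical 16-byte entry and 64-byte cluster, standing in for ttEntry/ttCluster.
struct Entry { std::uint32_t words[4]; };
using Cluster = std::array<Entry, 4>;
using Table = std::vector<Cluster, default_init_allocator<Cluster>>;

// Allocates `clusters` elements without writing to them.
inline Table makeTable(std::size_t clusters) {
    Table table;
    table.resize(clusters);
    assert(table.size() == clusters);
    return table;
}
```

A caller would write `Table t = makeTable(n);` and store into the entries before reading them. If the engine depends on an empty table being all zeros, an explicit clear pass is the safer choice.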
Thank you guys