I was working on a project that needed fast string search on a large collection of string values. I decided to use Trie for search and this approach was fast. This is part of that project that's relevant to my question:
class TTrieNode{
public:
char c;
bool data;
TTrieNode *left, *mid, *right;
TTrieNode(){
left = mid = right = NULL;
c = data = 0;
}
};
class TTrie{
private:
TTrieNode *root;
TTrieNode *insert(TTrieNode*n, char *s, int idx){
char ch = s[idx];
if(!n){
n = new TTrieNode();
n->c = ch;
}
if(ch < (n->c)){
n->left = insert(n->left, s, idx);
}else if(ch > (n->c)){
n->right = insert(n->right, s, idx);
}else if(idx+1 < strlen(s))
n->mid = insert(n->mid, s, idx+1);
else
n->data = true;
return n;
}
public:
TTrie() {
root = NULL;
}
void insert(char *s) {
root = insert(root, s, 0);
}
};
Everything was good until we tested my Trie on the real data. Based on my calculation on the number of nodes and the amount of space each node takes, it should have taken ~40GBs of RAM, but to my surprise it took ~70GBs. At first I thought this was because of memory alignment to each node(just a raw guess!), so I used __attribute__((packed, aligned(1)))
with my TTrieNode
definition!
Using this didn't make big of a difference. After a lot of tests I used some manual memory allocation. So instead of calling new
each time I want to allocate memory to a new node, I allocated ~50GBs of RAM in the beginnig of my program and used the following custom new function instead:
TTrieNode *memLoc;
int memIdx;
void initMemory(){
memLoc = (TTrieNode*) malloc(MAXNODES * sizeof(TTrieNode));
memIdx = 0;
}
TTrieNode*myNew(){
memLoc[memIdx].left = memLoc[memIdx].right = memLoc[memIdx].mid = NULL;
memLoc[memIdx].c = memLoc[memIdx].data = 0;
return &memLoc[memIdx ++];
}
This was very surprising but this time, the program took EXACTLY the amount of memory I was expecting!
Now my questions are these:
Why is some extra memory for each new (malloc)
? Is there some kind of pointer in the kernel/user level for each memory allocation? I haven't tested my code in windows(or any other operating system) but would like to know if there is some similar behavior on those operating systems as well.