How do I implement a memory arena allocator to reduce memory request calls in a trie-based spell checker?

Question

As part of the cs50x course, I made a trie-based dictionary spell checker that passes all of check50's tests as well as valgrind so no memory errors. Now I'm trying to implement a memory bucket to reduce malloc requests.

Here's my code:

dictionary.h

// Declares a dictionary's functionality

#ifndef DICTIONARY_H
#define DICTIONARY_H

#include <stdbool.h>

// Maximum length for a word
// (e.g., pneumonoultramicroscopicsilicovolcanoconiosis)
#define LENGTH 45
#define N 27
// Trie node definition
typedef struct TrieNode
{
    struct TrieNode *children[N];
    bool isEnd;
} TrieNode;

// Prototypes
bool check(const char *word);
bool load(const char *dictionary);
unsigned int size(void);
bool unload(void);

TrieNode *create_trie(void);
bool trie_insert(TrieNode **head, const char *word);
void trie_clear(TrieNode *root_ref);

#endif // DICTIONARY_H

dictionary.c

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include "dictionary.h"

// Macros
#define key(c) (c>='a'&&c<='z')?(c-'a'):(c>='A'&&c<='Z')?(c-'A'):26

// word count
unsigned int word_count = 0;

// Root trie node
TrieNode *root;

// Returns true if word is in dictionary, else false
bool check(const char *word)
{
    // Declare tmp ptr to root
    TrieNode *tmp = root;

    for (int i = 0 , n = strlen(word); i < n; i++)
    {
        int t = key(word[i]);
        if (tmp->children[t] == NULL)
        {
            return false;
        }
        tmp = tmp->children[t];
    }
    // Did index values lead to valid terminal?
    return tmp->isEnd;
}

TrieNode *create_trie(void)
{
    TrieNode *result = malloc(sizeof(*result));
    for (int i = 0; i < N; i++)
    {
        result->children[i] = NULL;
    }
    result->isEnd = false;
    return result;
}

// Double ptr allows us to alter the root node
bool trie_insert(TrieNode **head, const char *word)
{
    if (*head == NULL)
    {
        *head = create_trie();
    }

    TrieNode *tmp = *head;

    for (int i = *word++; *word; i = *word++)
    {
        unsigned int t = key(i);
        if (tmp->children[t] == NULL)
        {
            tmp->children[t] = create_trie();
        }
        tmp = tmp->children[t];
    }
    tmp->isEnd = true;
    return true;
}

// Loads dictionary into memory, returning true if successful, else false
bool load(const char *dictionary)
{
    // open dictionary file
    FILE *dict = fopen(dictionary, "rb");
    if (!dict)
    {
        return false;
    }
    // allocate more memory than needed for LENGTH chars
    char *word = malloc(sizeof(char) * LENGTH + 2);
    if (!word)
    {
        fclose(dict);
        return false;
    }
    // Keep reading infile word by word until NULL
    while (fgets(word, LENGTH + 2, dict))
    {
        // Skip all words that exceed length limit
        size_t word_length = strlen(word);
        if (word_length > LENGTH + 1)
        {
            break;
        }

        if (trie_insert(&root, word))
        {
            word_count++;
        }
    }

    // Free buffer and close infile
    free(word);
    fclose(dict);
    return true;
}

bool unload(void)
{
    trie_clear(root);
    return true;
}

void trie_clear(TrieNode *root_ref)
{
    TrieNode *tmp = root_ref;

    for (int i = 0; i < N; i++)
    {
        if (tmp->children[i])
        {
            trie_clear(tmp->children[i]);
        }
    }
    free(tmp);
}

// Returns number of words in dictionary if loaded, else 0 if not yet loaded
unsigned int size(void)
{
    return word_count;
}

My results:


My solution                                             Staff solution

WORDS MISSPELLED:     955                               WORDS MISSPELLED:     955
WORDS IN DICTIONARY:  143091                            WORDS IN DICTIONARY:  143091
WORDS IN TEXT:        17756                             WORDS IN TEXT:        17756
TIME IN load:         0.08                            | TIME IN load:         0.03
TIME IN check:        0.02                              TIME IN check:        0.02
TIME IN size:         0.00                              TIME IN size:         0.00
TIME IN unload:       0.04                            | TIME IN unload:       0.01
TIME IN TOTAL:        0.14                            | TIME IN TOTAL:        0.06

CURRENT OBSTACLE: Right now I think the most plausible explanation for my slower loading time is that I'm performing memory requests one trie node. During my research, I came across a lot of comments recommending some variation of a memory pool solution for reducing the total number of malloc requests made. The idea was to request a bunch of memory upfront and feed individual pieces of it to the insert function as needed. However, many of these comments only included a wiki article link without any further elaboration.

The only working solution I did come across was designed for a hash table spell checker. This implementation requests memory for an large array of hash nodes in a single block, and every subsequent block requested would be connected as a singly linked list.

Here's their hash node definition:

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
    unsigned int hash;
} node;

Their memory segment definition:

// adjust this value to suit
#define ARENASIZE       1000

// Structure for slab allocation of 1000 hash nodes as a linked list
typedef struct seg seg_t;
struct seg
{
    seg_t *seg_next;                // next segment
    int seg_count;                  // number of used nodes in this segment
    node seg_node[ARENASIZE];       // nodes in this segment
};

If I'm adapting this structure to my dictionary by substituting the array of hashnodes with an array of trienodes, is there anything I need to do differently or can I just the datatype to TrieNode seg_node[ARENASIZE]?

UPDATE Thanks to Craig Estey, I've successfully added the memory arena allocation function to my existing code. To my surprise this didn't actually reduce the loading time at all but the unloading times were basically reduced to zero which is still a win. Similarly, since I'm now able to use calloc instead of malloc, my lookup times are also consistently equal or faster than the staff solution. Now my spell checker is consistently slower than the staff solution by only around 0.02s to 0.04s

WORDS MISSPELLED:     12544                 WORDS MISSPELLED:     12544
WORDS IN DICTIONARY:  143091                WORDS IN DICTIONARY:  143091
WORDS IN TEXT:        265867                WORDS IN TEXT:        265867
TIME IN load:         0.07                | TIME IN load:         0.03
TIME IN check:        0.24                | TIME IN check:        0.26
TIME IN size:         0.00                  TIME IN size:         0.00
TIME IN unload:       0.00                | TIME IN unload:       0.02
TIME IN TOTAL:        0.32                | TIME IN TOTAL:        0.30

Why are you asking this? Your very last sentence contains the thing that you should try. Did you try it? Your allocation requirements are quite suitable for an arena approach. You only ever add stuff to the Trie, and at the end you tear down the entire thing. So, provided your allocator is efficient (_i.e._ when it allocates a new segment, it puts that at the head of the list so there's never any list searching required), then this is ideal. You'll cut down your allocations by a factor of 1000. Consider using `memset` to zero an entire struct, instead of that loop. — paddy, Dec 19 '22 at 01:58
Wow! Possibly the first time an answer of mine has been linked by someone else :-). Before I saw this, I was going to link to it. Here are two of my trie related answers: [I am trying to initialize a glogal structure(trie_node) as the head of a trie in my main function and am getting memory allocation problems](https://stackoverflow.com/a/62236036/5382650) and [Implement N-ary Tree in c](https://stackoverflow.com/a/62159300/5382650) They don't do the "pool" allocation but the `trienew` and `narynew` functions can be adapted using the `newnode` from the speller. — Craig Estey, Dec 19 '22 at 17:17
Your trie node doesn't have a parent pointer, which might be needed. I'd use that as the "next" in the pool impl. Or, in your case: `children[0]` Just lift the arena related code (e.g. the `seg` struct, `nodelist`, `seglist`, `newnode`, `freenode`, `segunload`) and change `node` into [your] `TrieNode`. It should be pretty close. — Craig Estey, Dec 19 '22 at 17:30
Yeah I was hoping you'd eventually show up in the comment section haha. I like your suggestion about including a pointer to parent node in the trienode typedef and using it to maintain the memory segments as a linked list regardless of whatever the program is doing with the dictionary data. — Jordan Chen, Dec 21 '22 at 04:51
However, I'm still a bit confused about the importance of the freenode function in your final hash table solution. My understanding is that the freenode function was originally used to gather all the "occupied" nodes for swift deletion, but would't this process becomes redundant once you use a memory arena to request and free memory? With that said, can you explain why its inclusion is still a necessary part of your implementation? — Jordan Chen, Dec 21 '22 at 04:51
After some experimenting, I got it to work without using nodelist or freenode but most of the improvements were associated with the reduced unload time which was a surprise since I thought most of my issues stemmed from constantly calling malloc. Still a huge breakthrough so I'm pretty happy about it. — Jordan Chen, Dec 21 '22 at 08:23

How do I implement a memory arena allocator to reduce memory request calls in a trie-based spell checker?

0 Answers0