0

I am working on a project where we are to write a hash table using a given header file. So far everything is working (my get is bugged but I can at least fix that) but I have run into a problem. I have no idea how to rehash an open addressing hash table. I'll provide my structs, my create function, and my put function.

/// The "buckets" in the table
typedef struct Entry_t {
void* key;     ///< key must be "hashable"
void* value;   ///< value can be any type
} Entry;

/// The hash table
typedef struct Table_t {
    Entry** table;    ///< table of Entry pointers
    size_t size;      ///< number of entries in table
    size_t capacity;  ///< total capacity of table
    long (*hash)(void* key);  ///< hash function for key
    bool (*equals)(void *key1, void* key2); ///< equals function for key comparison
    void (*print)(void *key, void* value);  ///< print function for table dump debug
    size_t collisions;      ///< number of collisions throughout life of hash table
    size_t rehashes;        ///< number of rehashes throughout life of hash table
} Table;

Table* create(long (*hash)(void* key), bool (*equals)(void* key1, void* key2), 
void (*print)(void* key1, void* key2))
{
    Table* t = (Table*)malloc(sizeof(Table));
    if (t == NULL)
    {
        return NULL;
    }
    t->size = 0;
    t->capacity = INITIAL_CAPACITY;
    t->collisions = 0;
    t->rehashes = 0;
    t->table = (Entry**)calloc(t->capacity, sizeof(Entry*));
    t->hash = hash;
    t->equals = equals;
    t->print = print;
    return t;
}

The way I have my code written, t->table[index] is null, so I have to malloc space for it right before I insert the values. Here is my put function.

void* put(Table* t, void* key, void* value)  // Rehashing needs to be completed
{
        int index = t->hash(key) % t->capacity;
        void* old_value = NULL;
        int check = 0;
        if (t->table[index] != NULL)
        {
                while (check == 0)
                {
                        if (t->table[index] != NULL)
                        {
                                if (t->equals(t->table[index]->key, key) == 0)
                                {
                                        index = (index + 1) % t->capacity;
                                }
                                else  // Meaning the keys match, then leave
                                {
                                        check = 1;
                                }
                        }
                        else  // Empty node, meaning the key is NOT in the table
                        {
                                check = 1;
                        }
                }
        }
        if (t->table[index] != NULL) // Meaning the key is in the table, so update the value
        {
                old_value = t->table[index]->value;
                t->table[index]->value = value;
        }
        else
        {
                t->size++;
                if ((float)t->size / t->capacity > LOAD_THRESHOLD)  // Time to rehash 
                {
                     // Not sure what to do here
                     // I was told that I must use realloc, but I am unsure how
                }
                t->table[index] = malloc(sizeof(Entry));
                t->table[index]->key = key;
                t->table[index]->value = value;
                t->print(key, value);
        }
        return old_value;
}

At first, inside the if test to see if I should rehash, I wanted to realloc the memory with

t->table = (Entry**)realloc((t->capacity * RESIZE_FACTOR), sizeof(Entry));

However this gave me the error

error: passing argument 1 of 'realloc' makes pointer from integer without a cast [-Werror]

Which I am unsure how to fix/approach.

There is also the matter of reassigning key/value pairs to new index's, as I believe once the capacity changes, the current index's in the table are not representative anymore. Just like realloc, I am unsure how to do this.

Any help is greatly appreciated, thank you!

2 Answers2

0

size_t is an integer. The 1st argument passed to realloc should be a pointer.

void *realloc(void *ptr, size_t size);

You need to pass a pointer to realloc, not t->capacity

Ben
  • 179
  • 1
  • 3
  • I see, i'll fix that right away. Thank you :D How should I reassign the old Entries using the new capacity, though? – Ryan Morrissey Oct 29 '15 at 02:09
  • Maybe you just want do the following? temp_table = (Entry**)realloc((t->table, sizeof(Entry) * RESIZE_FACTOR); t->table = remp_table; – Ben Oct 29 '15 at 02:16
0

"There is also the matter of reassigning key/value pairs to new indexes, as I believe once the capacity changes, the current indexes in the table are not representative anymore. Just like realloc, I am unsure how to do this."

That's correct, the hash function must change if you want to make use of the new capacity, thus the indexes are not representative anymore. That is the reason why malloc() is more appropriate than realloc() for this case.

A simple resize of the hashtable can be done by these steps:

  1. Update the capacity
  2. Update the hash function (make use of the new capacity)
  3. Allocate a new table of pointers with the new capacity using malloc.
  4. Copy one by one all the pointers of the last table to the new table (at the appropriate locations given by the updated hash function)
  5. Free the memory of the previous table of pointers (Do not free the entry pointers themselves though!)

Comments:

  • realloc() is unuseful in this case because you are not interested in keeping the location of the pointers since the hash function has changed.
  • Resize should be done sparingly (A malloc call, a bunch of pointer copies...) So increment the capacity at least 1.5x or 2x every resize.
  • A nice and simple hash table implementation is show in Section 6.6 of The C Programming Language (2nd Edition).

For better references, you can check how this is done by leading programmers, for example the dictionary implementation of Python (using C): How are Python's Built In Dictionaries Implemented? and source code.

MrIo
  • 108
  • 1
  • 10