Let me start off by saying I know very basic, basic C. I have implemented a Hash Table using the code from the tutorialspoint website.
My program is reading in positive whole numbers from a text file, putting the number and the number of times it occurs in the file as the key and value respectively. It works nicely actually, but I face a big problem: I need to know the size of the hash table beforehand. I noticed that when I skip over a negative number, it weirdly messes up the structure of the table.
Here is my code (if using a size of 20):
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <stdbool.h>
#define SIZE 20
struct DataItem {
int key;
int data;
};
struct DataItem* hashArray[SIZE];
struct DataItem* item;
int hashCode(int key) {
return key % SIZE;
}
struct DataItem *search(int key) {
int hashIndex = hashCode(key);
while(hashArray[hashIndex] != NULL) {
if(hashArray[hashIndex]->key == key)
return hashArray[hashIndex];
++hashIndex;
hashIndex %= SIZE;
}
return NULL;
}
void insert(int key,int data) {
struct DataItem *item = (struct DataItem*) malloc(sizeof(struct DataItem));
item->data = data;
item->key = key;
int hashIndex = hashCode(key);
while(hashArray[hashIndex] != NULL && hashArray[hashIndex]->key != -1) {
++hashIndex;
hashIndex %= SIZE;
}
hashArray[hashIndex] = item;
}
void display() {
int key_index;
for(key_index = 0; key_index<SIZE; key_index++) {
if(hashArray[key_index] != NULL) {
printf("%d has occurred %d time(s)\n", hashArray[key_index]->key, hashArray[key_index]->data);
}
}
printf("\n");
}
void sort(size_of_struct) {
int i, tempkey, tempdata;
if(size_of_struct == 1) {
return;
}
for(i=0; i<(size_of_struct-1); i++) {
if(hashArray[i] != NULL) {
if(hashArray[i]->key > hashArray[i+1]->key) {
tempkey = hashArray[i]->key;
hashArray[i]->key = hashArray[i+1]->key;
hashArray[i+1]->key = tempkey;
tempdata = hashArray[i]->data;
hashArray[i]->data = hashArray[i+1]->data;
hashArray[i+1]->data = tempdata;
}
}
}
sort(size_of_struct-1);
}
int main() {
FILE *myFile;
myFile = fopen("test.txt", "r");
int *numberArray = NULL;
int i, count, number, current_num, temp;
count = number = temp = 0;
numberArray = malloc(sizeof(int)*500);
while(fscanf(myFile, "%d", &number) > 0) {
if(number >= 0 && number <= 100) {
numberArray[count] = number;
count++;
}
}
for(i=0; i < count; i++)
{
item = search(numberArray[i]);
if(item == NULL) {
insert(numberArray[i], 1);
} else {
item = search(numberArray[i]);
item->data += 1;
}
free(numberArray);
numberArray = malloc(sizeof(int)*500);
}
display();
sort(SIZE);
}
If I run this code, I get the following result:
80 has occurred 1 time(s)
100 has occurred 1 time(s)
2 has occurred 1 time(s)
3 has occurred 2 time(s)
84 has occurred 1 time(s)
12 has occurred 1 time(s)
33 has occurred 1 time(s)
99 has occurred 3 time(s)
Segmentation fault: 11
If I use a SIZE of 8 (which is the total number of numbers that I am using from the file), then I get the following result:
80 has occurred 1 time(s)
33 has occurred 1 time(s)
2 has occurred 1 time(s)
99 has occurred 3 time(s)
3 has occurred 2 time(s)
12 has occurred 1 time(s)
100 has occurred 1 time(s)
84 has occurred 1 time(s)
2 has occurred 1 time(s)
3 has occurred 2 time(s)
12 has occurred 1 time(s)
33 has occurred 1 time(s)
80 has occurred 1 time(s)
84 has occurred 1 time(s)
99 has occurred 3 time(s)
100 has occurred 1 time(s)
Upon further investigation, I noticed this happening in my display function:
key_index: 0
80 has occurred 1 time(s)
key_index: 1
100 has occurred 1 time(s)
key_index: 2
2 has occurred 1 time(s)
key_index: 3
3 has occurred 2 time(s)
key_index: 4
84 has occurred 1 time(s)
key_index: 5
key_index: 6
key_index: 7
key_index: 8
key_index: 9
key_index: 10
key_index: 11
key_index: 12
12 has occurred 1 time(s)
key_index: 13
33 has occurred 1 time(s)
key_index: 14
key_index: 15
key_index: 16
key_index: 17
key_index: 18
key_index: 19
99 has occurred 3 time(s)
Segmentation fault: 11
I am not entirely sure what is happening. I know why the segmentation fault is occurring: it is trying to reference a value that is null at that index. What I don't know is why (inside of the display function) does it skip over the key_index values from 5-11 and 14-18; that is my real concern. I am assuming that because I am not using all of the numbers in the file, the structure is trying to populate it so that there are always 20 spaces. Am I wrong? Is there a way I can fix this?
My test.txt file:
99 2 99
3
80 12 -12 33
3 99 100 1234 84