0

I'm trying to implement a hash table in C. I have a csv file, which only consists of a list of surnames on separate lines each. e.g.

Synott
O Neill
Potter

that I am to read from. I then use a hash function, called hash1, to convert these strings into an integer which would give me my index for my hash table. I am then to store the frequency of the occurrence of each name. However, I am getting a segmentation fault which I have deduced is coming from my

void insert(struct individual *p)

function. I'm clueless as to how to solve this and have only started practising C.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#define maxSize 500

/* Hash function that returns my key as an integer, giving me my index */
int hash1(char *s){
    int hash = 0;
    while(*s){
        hash = hash + *s;
        s++;
    }
return hash;
}

/* Item to be stored */
struct individual{
    int frequency; /* Value to be stored */
    char name[50]; /* This is will be my key */
};

/* The hash table */ 
struct individual *hashArray[maxSize];

/* Initialising hash table */
void initArray(){
    for (int i = 0; i < maxSize; i++){
        hashArray[i]->frequency = 0;
        hashArray[i]->name[i] = 0;
    }
}

/* Function to load the names */
int next_field(FILE *f, char *buffer, int max){
    int i = 0, end = 0;
    for(;;){
        buffer[i] = fgetc(f);
        if(buffer[i] == '\n' || feof(f)){ end = 1; break; }
    }
    buffer[i] = 0;
    return end;
};

/* Loading the names into structs - Done */
void reader(FILE *f, struct individual *n){
    char buf[50];
    next_field(f, n->name, maxSize);
};

/* Adding to the hash table */
void insert(struct individual *p){
    struct individual *person =  malloc(sizeof(struct individual));
    int index = hash1(p->name) % maxSize;

    // The issue is coming from this line here:
    int primaryIndex = hash1(hashArray[index]->name) % maxSize;  


    /* Linear Probing */
    printf("Do I get to here\n");
    while(hashArray[index] != NULL){

        if(primaryIndex == index){
           hashArray[primaryIndex]->frequency++;  
        } else{
            ++index;  /* Going to next block */ 
        }
        index %= maxSize; /* Looping through the hash table */
    }
    hashArray[index] = person;
};

void display(struct individual *duine){
    printf("%s\n", duine->name);
};

int main(int argc, char *argv[]){
    FILE *list;
    struct individual in;
    //initArray();

    /* Opening file */
    if( argc < 2 ) { 
        printf("Also include csv file name.\n"); 
        return EXIT_FAILURE; 
    }

    /* Checking if file is found */
    list = fopen(argv[1], "r");
    if(!list) { 
        printf("File not found. %s\n", argv[1]); 
        return EXIT_FAILURE; 
    }

    while(!feof(list)){
        reader(list, &in);
        insert(&in);
        display(&in);
    }

    fclose(list);
    return EXIT_SUCCESS;
}

What I'm trying to do here is to compare two indices, one from the struct p being passed into this function and one from the hash table at this index. If they are the same, I wish to increase the frequency count stored there by 1. If I do remove this line, the rest of my code runs fine.

Thank you very much

David C. Rankin
  • 81,885
  • 6
  • 58
  • 85
JsM
  • 11
  • how do you know that object under index exist? (is not null): int primaryIndex = hash1(hashArray[index]->name) % maxSize; – JosiP Oct 13 '21 at 15:11
  • 1
    Tip: `maxSize` better as a [prime](https://stackoverflow.com/a/32915815/2410359) than 500. With `index % 500`, code is over emphasizing the lower 2 bits of `index`. With a prime, all index bits contribute about equally. – chux - Reinstate Monica Oct 13 '21 at 15:24
  • End-of-file detection messed up. Also, return value of `next_field(f, n->name, maxSize)` not used. – chux - Reinstate Monica Oct 13 '21 at 15:29
  • `for(;;){ buffer[i] = fgetc(f); if(buffer[i] == '\n' || feof(f)){ end = 1; break; } }` is suspicious. I'd expect a `i++` someplace. – chux - Reinstate Monica Oct 13 '21 at 15:31
  • 1
    You will want to look at [Why is while ( !feof (file) ) always wrong?](https://stackoverflow.com/questions/5431941/why-is-while-feoffile-always-wrong) .. – David C. Rankin Oct 13 '21 at 17:26
  • You are using [open addressing, linear probing, and a static table](https://en.wikipedia.org/wiki/Hash_table), but you store pointers instead of values, (which is reasonable, but you kind of get the worse of both.) `initArray` seems to be thinking that it's value-based. Your hash function is poor; CLT will clump them by length. – Neil Oct 13 '21 at 20:13

0 Answers0