HashTable: Segmentation fault when inserting - C

2

I'm trying to implement a hashtable, in which the table would be an array of "buckets" where each contained user information, my code:

#define tam_inicial 23

typedef struct user{
    char nick[6];
    char nome[26];
}user;

typedef struct hashtable{
    int tam;
    user **baldes;
}hashtable;

int tam = tam_inicial;

hashtable * create() {
    hashtable *htable = malloc(sizeof(htable));
    htable->tam = tam_inicial;
    htable->baldes = calloc(tam_inicial, sizeof(htable->baldes));

    return htable;
}

int hash(char *string) {
    int hashVal = 0;
    for( int i = 0; i < strlen(string);i++){
        hashVal += (int)string[i];
    }

    return hashVal % tam;
}

bool isPrime(int num){
    if(num==2){
        return true;
    }
    if(num % 2==0){
        return false;
    }
    for(int i=3; i*i<num; i+=2){
        if(num%i==0){
            return false;
        }
    }
    return true;
}

int Prime_num(int old_size){
    for(int i = old_size; i < old_size * 2 +10; i++){
        if(isPrime(i) && i >= 2*old_size){
            return i;
        }
    }
}

hashtable *resize_HashTable(hashtable *HashTable){
    if(load_factor()){
        int tmp = Prime_num(tam);
        HashTable = realloc(HashTable, tmp * sizeof(user));
        tam = tmp;
    }
    return HashTable;
}

void inserir(hashtable *HashTable, char *nome, char *nick){
    HashTable = resize_HashTable(HashTable);
    int hash_value = hash(nick);
    while(HashTable->baldes[hash_value] != 0 && hash_value < HashTable->tam){
        hash_value++;
    }
    strcpy(HashTable->baldes[hash_value]->nome, name);
    strcpy(HashTable->baldes[hash_value]->nick, nick);
    HashTable->tam = HashTable->tam++;
}

For some reason these 2 lines give segmentation fault:

strcpy(HashTable->baldes[hash_value]->nome, name);
strcpy(HashTable->baldes[hash_value]->nick, nick);

It is as if I could not access the struct user's "nickname" and "nickname" variants for there names / nicks

Any reason for this to happen? Thanks in advance.

    
asked by anonymous 18.05.2018 / 22:37

1 answer

1

Note this section:

HashTable = resize_HashTable(HashTable);
int hash_value = hash(nick);
while(HashTable->baldes[hash_value] != 0 && hash_value < HashTable->tam){
    hash_value++;
}

Suppose that HashTable->tam is 23 and that hash_value returned is -1904468. What will happen? Segmentation fault because you will be accessing HashTable->baldes[hash_value] off the right size.

I think what you have to do is this:

HashTable = resize_HashTable(HashTable);
int hash_value = hash(nick);
int balde = hash_value % HashTable->tam;
if (balde < 0) balde += HashTable->tam;
while (HashTable->baldes[balde] != 0 && balde < HashTable->tam) {
    balde++;
}

But that only solves part of the problem. For if this while reaches the end of the array (especially if it already has started there) and does not find a free place to put the name / nick, it will exit the limit of the array and give a segmentation fault instead of returning to the beginning. So you should do this:

HashTable = resize_HashTable(HashTable);
int hash_value = hash(nick);
int balde_inicial = hash_value % HashTable->tam;
if (balde_inicial < 0) balde_inicial += HashTable->tam;
int balde = balde_inicial;
while (HashTable->baldes[balde] != 0 && balde != balde_inicial - 1) {
    balde++;
    balde %= HashTable->tam;
}

If HashTable gets full, this will be an infinite loop. Therefore, it is important that resize_HashTable ensure that a HashTable full is never returned.

[EDIT]: Okay, the question has been edited and its hash function never returns values like this. However, its hash function should not be based on the start size of the hash table. If the table size changes with resize_HashTable , the hash function does not change together. In addition, you can have multiple tables in memory each with a different size.

Another problem is that if the string is large enough, the hash function will overflow and may return a negative value.

The solution is simply to remove the % tam from the hash function. Who has to worry about this to know in which bucket the data is going to be placed is the hash table and not the hash function that does not know which table is going to be used. The hash function is only concerned with returning an arbitrary (possibly negative) number that represents the string.

    
18.05.2018 / 23:25