Help with understanding hash usage error

I'm writing a program that reads the words of a text file and generates a hash code for each one. The hash is used as the index at which the word is saved in a vector that holds up to 1000 words.

Function that generates the hash:

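int geraHash(char * str){
    int a, i, j, soma = 0;
    i = strlen(str);
    j = 1;
    for(a = 0; a < i; a++){
        soma += (int)str[a] * pow(7, j); // character code times 7^(position + 1)
        j++;
    }
    soma = soma % 1000; // keep the index within the 1000-slot vector
    printf("Hash: %d\n", soma);
    return soma;
}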

The Vetor struct:

typedef struct vetor{  
    char * palavra;  
}Vetor;

And the code:

char word[1000];
Vetor * vet = (Vetor *)malloc(1000*sizeof(Vetor));

FILE * file;
file = fopen("teste.txt", "r");


while(fgets(word, 1000, file) != NULL){
    char * p;
    p = strtok(word, " \n \r \t");
    while(p != NULL){
        strupr(p);
        int hashPos;
        hashPos = geraHash(p);

        printf("Palavra: %s  Hash: %d\n",p, hashPos);
        printf("\n");

        vet[hashPos].palavra = p;


        p = strtok(NULL, " \n \r \t");
    }
}

    printf(" %s " , vet[670].palavra);
    printf(" %s " , vet[801].palavra);
    printf(" %s " , vet[867].palavra);
    printf(" %s " , vet[846].palavra);


fclose(file);

The teste.txt file contains only two lines: the first is "abc def" and the second is "asd cd". Output:

I don't understand why the same word appears at two different positions generated by the hash function. Could anyone tell me where I'm going wrong?

    
asked by anonymous 12.10.2018 / 01:26

1 answer

The problem comes from how strtok works, combined with the fact that you are saving the pointer it returns.

Consider the caveat the documentation gives about the string passed to strtok:

  

Notice that this string is modified by being broken into smaller strings (tokens).

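In other words, strtok does not create new strings: each token it returns points into the original buffer, here word. So when you save the word:

vet[hashPos].palavra = p;

you are storing a pointer into that buffer, which strtok modifies and which the next fgets call overwrites with the following line. A pointer saved while processing an earlier line then shows whatever the buffer currently holds at that offset, which is why the same word appears at two different positions.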
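
To see the effect in isolation, here is a minimal sketch. The helper name aliasing_demo and the buffer contents are made up for illustration and are not part of the original program. It saves a token pointer from strtok, then reuses the buffer the way the next fgets call would, and checks that the saved pointer now reads different text.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, for illustration only.
 * Returns 1 if a token pointer saved from strtok reads "abc" at first
 * and "asd" after the underlying buffer is reused, 0 otherwise. */
int aliasing_demo(void)
{
    char word[32] = "abc def";
    char *saved = strtok(word, " \n\r\t");   /* points at word[0], reads "abc" */
    assert(saved != NULL);

    int before = strcmp(saved, "abc") == 0;

    /* Simulate the next fgets call overwriting the same buffer. */
    strcpy(word, "asd cd");
    strtok(word, " \n\r\t");                 /* writes '\0' after "asd" */

    int after = strcmp(saved, "asd") == 0;   /* saved still points at word[0] */
    return before && after;
}
```

The pointer itself never changes; only the bytes it points at do, which is what happens to every entry stored in vet.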

The solution is to store a copy of each token instead of a pointer into the buffer. There is already a function for this, strdup (POSIX, and standard C since C23), which allocates the space the string needs with malloc. This means that once you no longer need these strings, you must call free on them to avoid a memory leak.

You just need to change the statement that saves the word to:

vet[hashPos].palavra = strdup(p);

Of course, you could also duplicate the string by hand using malloc, strlen and strcpy:

vet[hashPos].palavra = malloc(strlen(p) + 1); // +1 for the terminator
strcpy(vet[hashPos].palavra, p);
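
Putting the pieces together, the sketch below shows one possible way to store owned copies and release them later. The names duplica, guarda and libera are hypothetical helpers, not part of the original code. The table is assumed to start out zeroed (for example, allocated with calloc) so that unused slots hold NULL. Freeing a word that is already in a slot before overwriting it is also an assumption about how collisions should be treated.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct vetor {
    char *palavra;
} Vetor;

/* Hypothetical helper: returns a heap copy of s, or NULL if malloc fails. */
char *duplica(const char *s)
{
    assert(s != NULL);
    char *copia = malloc(strlen(s) + 1);   /* +1 for the '\0' terminator */
    if (copia != NULL)
        strcpy(copia, s);
    return copia;
}

/* Hypothetical helper: stores an owned copy of p in slot pos.
 * Assumes unused slots are NULL; an earlier word in the slot is freed. */
void guarda(Vetor *vet, int pos, const char *p)
{
    free(vet[pos].palavra);                /* free(NULL) is a no-op */
    vet[pos].palavra = duplica(p);
}

/* Hypothetical helper: releases every stored word, then the table itself. */
void libera(Vetor *vet, int n)
{
    for (int i = 0; i < n; i++)
        free(vet[i].palavra);
    free(vet);
}
```

In the loop from the question, vet[hashPos].palavra = p; would become guarda(vet, hashPos, p);, with vet allocated as calloc(1000, sizeof(Vetor)) and libera(vet, 1000) called after fclose.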

Example running on my machine:

    
12.10.2018 / 13:47