How can -1 be greater than 4?

40

How can this code behave this way?

#include <stdio.h>

int main(void) {
    /* sizeof yields a size_t, so the matching format specifier is %zu */
    printf("size of an int: %zu\n", sizeof(int));
    if(sizeof(int) > -1) {
        printf("4 is greater than -1");
    } else {
        printf("-1 is greater than 4");
    }
    return 0;
}

Output:

-1 is greater than 4

Intuitively, -1 should always be less than 4. Why does this kind of "error" occur? Isn't the computer a machine that does not make mistakes?

    
asked by anonymous 21.01.2015 / 13:40

2 answers

49

Even though my mother and some programmers think that the computer makes its own decisions, it can only do what humans tell it to do.

Of course, a computer can produce wrong results without the person using it making a mistake. But that means a human designed the computer, or at least one of its components, incorrectly. Or the specification allows for the error, and whoever uses that hardware needs to be aware of it and take appropriate steps so that it does not produce unwanted results. In practice, errors occur far more often in software.

Programmers make many mistakes, often because they do not understand every aspect of what they are doing. This is normal. Nobody knows everything, not even everything about software development.

This specific "problem" occurs because of the different types used in the code. It is not very obvious, but if you look at the specification you will see that the sizeof operator yields a value of an unsigned integer type, more specifically size_t, while the other operand of the comparison in the if is a signed int, or simply int. That is, the code compares a type that has a sign with one that does not. Because of this, the signed operand is implicitly converted to the unsigned type, and that conversion changes how the data is interpreted.

Once you know that the operator yields an unsigned integer, that an implicit conversion takes place, and that a negative value converted to an unsigned type starts at the largest possible value and counts down as the negative number grows in magnitude (the sign bit is no longer treated as a sign but as part of the number), it is easy to understand what is happening. The problem is just a human misinterpretation. The computer did what it was told.

It's easier to understand by printing -1 as unsigned:

#include <stdio.h>

int main(void) {
    printf("size of an int: %zu\n", sizeof(int));
    /* the explicit cast makes the argument match the %u specifier */
    printf("-1 with a cast: %u\n", (unsigned int)-1);
    if(sizeof(int) > -1) {
        printf("4 is greater than -1");
    } else {
        printf("-1 is greater than 4");
    }
    return 0;
}

So the message should be "4294967295 is greater than 4".

See it running on ideone.

Some people will take advantage of this to criticize implicit conversion. Indeed, it is bad when the language is expected to be used by programmers who do not fully understand every operation they are using, or who tend to be inattentive. But I think it helps more than it gets in the way.

In a way this relates to what I answered in another question. Not knowing the type that an expression produces is the real problem in the code. Making everything explicit would help avoid some problems, but it would make the code longer and redundant by adding unnecessary detail. Ironically, this can be beneficial in languages that target programmers who do not care about details.

Conclusion

Learn the type of every subexpression you are computing, and make sure you are using the right types. In this case, if the intent really is to compare the size of the type with a signed integer, then you should make sure that the result of sizeof(int) is an int by using a cast. Like this:

#include <stdio.h>

int main(void) {
    printf("size of an int: %d\n", (int)sizeof(int));
    if((int)sizeof(int) > -1) {
        printf("4 is greater than -1");
    } else {
        printf("-1 is greater than 4");
    }
    return 0;
}

See it working on ideone and on CodingGround.

This question-and-answer pair was inspired by a Reddit discussion that I found interesting enough to share.

    
21.01.2015 / 13:40
27

Actually, sizeof yields a value of type size_t, which is an unsigned type. The problem arises in the conversion from signed binary to unsigned binary when the programmer is not careful. A negative integer such as -1 is represented in binary as the two's complement of its positive equivalent; practically every C implementation works this way.

Example of how the computer sees a negative number: take the number 1 (positive) in binary form, 4 bytes wide (32 bits, or 32 binary digits):

00000000 00000000 00000000 00000001 -> binary for 1
11111111 11111111 11111111 11111110 -> one's complement (invert the bits)
                                + 1 -> two's complement (add 1 to the one's complement)
___________________________________
11111111 11111111 11111111 11111111 -> two's complement, equivalent to -1

The problem occurs when the computer treats a number that should be negative as positive. For example, the bit pattern for decimal -1 above, when treated as an unsigned number, is read by the computer as 4294967295 decimal, which is obviously greater than an unsigned 4 of the same type.

Given this, you must pay close attention to how the computer and the programming language handle implicit type conversions. Study casting well enough to be sure you always do it safely.

    
29.01.2015 / 17:35