Although the answer has already been given, I think it's worth explaining why these inaccuracies occur.
The IEEE 754 standard (here and here), which defines floating-point types such as float and double, specifies that numbers are stored in normalized scientific notation with base 2. Since the base is 2 and the normalized mantissa must be greater than or equal to 1 and smaller than the base, its integer part is always exactly 1. For example:
- The number 4 is treated as 1.0 × 2²
- The number 10 is treated as 1.25 × 2³
- The number 6.25 is treated as 1.5625 × 2²
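These decompositions can be checked in code. Here is a small Python sketch (the function name is mine) using math.frexp, which returns a mantissa in [0.5, 1), one binary shift away from the IEEE-style [1, 2) form:

```python
import math

def binary_scientific(x):
    # math.frexp returns (m, e) with x == m * 2**e and 0.5 <= m < 1;
    # doubling m and decrementing e gives the normalized 1 <= m < 2 form
    m, e = math.frexp(x)
    return m * 2, e - 1

print(binary_scientific(4.0))    # (1.0, 2)
print(binary_scientific(10.0))   # (1.25, 3)
print(binary_scientific(6.25))   # (1.5625, 2)
```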
To store such a value, however, the precision is not infinite: it is limited to a specific number of bits.

float uses 32 bits, laid out as s | eeeeeeee | 23 × m (1 bit for the sign, 8 for the exponent and 23 for the mantissa). double uses 64 bits, laid out as s | eeeeeeeeeee | 52 × m (1 bit for the sign, 11 for the exponent and 52 for the mantissa).
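One way to see these fields is to reinterpret a float's 32 bits as an integer and mask the pieces out. A Python sketch (the function and field names are mine):

```python
import struct

def float_fields(x):
    # Pack x as a 32-bit IEEE 754 float, then reinterpret the bytes as an int
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    sign     = bits >> 31            # 1 bit
    exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
    mantissa = bits & 0x7FFFFF       # 23 bits: the fractional part only
    return sign, exponent, mantissa

# 6.25 = 1.5625 × 2²: the exponent is stored biased, as 2 + 127 = 129,
# and the mantissa bits are 1001 followed by nineteen zeros
print(float_fields(6.25))
```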
As the integer part of the mantissa is always 1, it does not need to be stored: the mantissa field holds only the fractional part.
To convert the mantissa from binary to decimal, each bit must be multiplied by a negative power of 2: the first bit by 2⁻¹, the second by 2⁻², and so on.
With this, a mantissa equal to
10010000000000000000000 (using 23 bits for simplicity), when converted to decimal becomes:
2⁻¹ + 2⁻⁴ = 0.5 + 0.0625 = 0.5625

As the leading 1 is implicit, this mantissa actually represents 1.5625.
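The conversion rule above can be written directly as a small Python function (a sketch of mine, taking the mantissa as a bit string):

```python
def mantissa_value(bits):
    # Bit i (counting from 1) is worth 2**-i; the leading 1 is implicit
    return 1 + sum(2 ** -(i + 1) for i, b in enumerate(bits) if b == '1')

print(mantissa_value('10010000000000000000000'))  # 1.5625
```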
This is where problems arise with seemingly simple numbers. Take 3.2, for example: in scientific notation with base 2, it becomes 1.6 × 2¹.
So far, so good. The problem arises when converting this mantissa to binary: its fractional part must be representable as a sum of negative powers of 2, but 0.6 cannot be written as a finite sum of negative powers of 2. Notice its binary representation (using 23 bits for simplicity):

10011001100110011001100

If split into groups, it can be seen that it is a repeating pattern:

1001 1001 1001 1001 1001 100

The last repetition is cut short by the 23-bit limit. However, even with more bits the sum of these powers would still never be exactly 0.6, similar to what happens when you divide 1 by 3 in decimal.
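The repeating expansion of 0.6 can be reproduced with the classic multiply-by-2 method. A Python sketch, using exact fractions so the demonstration is not contaminated by the very rounding it illustrates:

```python
from fractions import Fraction

def frac_bits(x, n):
    # Standard fraction-to-binary expansion:
    # double the value and emit its integer part at each step
    out = []
    for _ in range(n):
        x *= 2
        if x >= 1:
            out.append('1')
            x -= 1
        else:
            out.append('0')
    return ''.join(out)

print(frac_bits(Fraction(6, 10), 23))  # 1001 repeats, truncated at 23 bits
```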
For a computer using the float data type, the number 3.2 therefore cannot be stored exactly: truncating the mantissa to 23 bits gives 3.1999998092651367 (approx.), and even IEEE 754's default of rounding to the nearest representable value gives 3.200000047683716, slightly above the target. In double, the only difference is the extra mantissa bits, which push the stored value much closer to 3.2. But it would still be "wrong."
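You can observe this in Python by forcing the value through 32-bit precision with struct (Python's own floats are doubles). Note that this round-trip applies IEEE 754's round-to-nearest, so the result here lands slightly above 3.2 rather than below:

```python
import struct

# Round-trip 3.2 through a 32-bit float and inspect what comes back
as_float = struct.unpack('f', struct.pack('f', 3.2))[0]
print(as_float)         # prints something like 3.200000047683716
print(as_float == 3.2)  # False
```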
Occasionally, though, numbers that look "strange" to us can be stored exactly. For example, 5.13671875. Transformed into scientific notation: 1.2841796875 × 2².
Despite the seemingly large number of decimal places, this number fits perfectly in a float variable, since 0.2841796875 is a sum of four negative powers of 2:

2⁻² + 2⁻⁵ + 2⁻⁹ + 2⁻¹⁰
In 23-bit binary format, the mantissa is: 01001000110000000000000
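Using the same struct round-trip trick as before, one can check in Python that this value, built from its four negative powers of 2, survives 32-bit storage exactly, while 3.2 does not:

```python
import struct

def roundtrips_as_float(x):
    # True when x is exactly representable in IEEE 754 single precision
    return struct.unpack('f', struct.pack('f', x))[0] == x

exact = (1 + 2**-2 + 2**-5 + 2**-9 + 2**-10) * 2**2  # 1.2841796875 × 2²
print(roundtrips_as_float(exact))  # True
print(roundtrips_as_float(3.2))    # False
```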
If you want to test other numbers, you can use an interactive online tool that I make available to my students: IEEE 754 Floating Point.
And this link has four tutorials teaching how to do the process manually.
There I use the 32-bit format, but it should give you a good idea of how the 64-bit format works.