Although the answer has already been given, I think it's worth explaining why these inaccuracies occur.
The IEEE 754 standard, which defines floating-point numbers such as `float` or `double` (Java, C, C++, C#), as well as ordinary JavaScript numbers, treats numbers internally in a kind of scientific notation whose base is always 2.
Since the base is 2, and the mantissa must always be greater than or equal to 1 and smaller than the base, its integer part is always 1.
So:
- The number 4 is treated as 1.0 × 2²
- The number 10 is treated as 1.25 × 2³
- The number 6.25 is treated as 1.5625 × 2²
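This decomposition can be checked in Python (used here purely for illustration; the helper name is my own), since `math.frexp` already splits a number into mantissa and exponent:

```python
import math

# frexp() returns a mantissa in [0.5, 1) and an exponent; doubling the
# mantissa and decrementing the exponent gives the IEEE-style form
# m × 2^e with 1 <= m < 2.
def to_scientific_base2(x):
    m, e = math.frexp(x)
    return m * 2, e - 1

print(to_scientific_base2(4.0))    # (1.0, 2)
print(to_scientific_base2(10.0))   # (1.25, 3)
print(to_scientific_base2(6.25))   # (1.5625, 2)
```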
To store these values, however, the precision is not infinite; it is limited to a specific number of bits. `float` uses 32 bits and `double` (like the numbers in JavaScript) uses 64 bits, divided as follows:

32 bits: `s|eeeeeeee|23 × m` (1 bit for the sign, 8 for the exponent and 23 for the mantissa)

64 bits: `s|eeeeeeeeeee|52 × m` (1 bit for the sign, 11 for the exponent and 52 for the mantissa)
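A sketch of pulling those three fields out of the 32-bit format in Python, using `struct` to get at the raw bits (the variable names are mine, just to label the layout):

```python
import struct

# Pack 6.25 as a 32-bit float and extract the three fields.
bits = struct.unpack('>I', struct.pack('>f', 6.25))[0]

sign     = bits >> 31            # 1 bit
exponent = (bits >> 23) & 0xFF   # 8 bits, biased by 127
mantissa = bits & 0x7FFFFF       # 23 bits, fractional part only

# 6.25 = 1.5625 × 2², so the unbiased exponent is 2.
print(sign, exponent - 127, f'{mantissa:023b}')
```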
Since the integer part is always 1, it is left implicit, and the mantissa stores only the fractional part.
To convert from binary notation to a decimal representation, each bit in the mantissa must be multiplied by a negative power of 2: the first bit by 2⁻¹, the second by 2⁻², and so on.
With this, a mantissa equal to `10010000000000000000000` (using 23 bits for simplicity), when converted to decimal, becomes:

2⁻¹ + 2⁻⁴ = 0.5 + 0.0625 = 0.5625

Since the leading 1 is implicit, this mantissa is really worth 1.5625.
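That conversion is easy to reproduce (a small illustrative helper, not any standard API):

```python
# Decode a mantissa bit string: bit i is worth 2^-(i+1),
# and the implicit leading 1 is added back at the end.
def decode_mantissa(bits):
    frac = sum(2.0 ** -(i + 1) for i, b in enumerate(bits) if b == '1')
    return 1.0 + frac

print(decode_mantissa('10010000000000000000000'))  # 1.5625
```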
From there various problems arise with seemingly simple numbers.
The number 3.2, for example. In scientific notation with base 2, it becomes 1.6 × 2¹.
So far, no problems; the problem arises when converting this mantissa to binary. Its fractional part should be representable as a sum of negative powers of 2, but it turns out that 0.6 cannot be represented by a finite sum of negative powers of 2. Notice its binary representation (using 23 bits for simplicity):
10011001100110011001100
If split properly, it can be seen that it is a repeating pattern:

1001 1001 1001 1001 1001 100

The last repetition, however, is truncated. Even if it were not truncated, the sum of these powers would still not equal 0.6 exactly, similar to what happens when you divide 1 by 3.
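The same repeating pattern can be seen in a real `double`. A sketch in Python (note that Python rounds to nearest when storing 3.2, so the very last group of the 52-bit mantissa differs from a pure truncation of the infinite expansion):

```python
import struct

# Raw bit pattern of 3.2 as a 64-bit double; the low 52 bits are
# the stored mantissa, i.e. the binary expansion of 0.6.
bits = struct.unpack('>Q', struct.pack('>d', 3.2))[0]
mantissa = f'{bits & (2**52 - 1):052b}'

# Group in blocks of 4 to make the repeating 1001 visible.
print(' '.join(mantissa[i:i + 4] for i in range(0, 52, 4)))
```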
For a computer using the `float` data type, the number 3.2 is internally stored as 3.1999998092651367 (approx.). With `double`, the only difference is in the number of 9's that would follow the 1 (there would be more of them). But it would still be "wrong."
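That stored value can be reproduced by hand. A sketch in Python, truncating 0.6 to a 23-bit fraction as in the example above (truncation rather than rounding, to match the text):

```python
# Truncate the fraction 0.6 to 23 bits, then reassemble 1.f × 2^1.
frac_bits = int(0.6 * 2**23)            # truncated 23-bit mantissa
stored = (1 + frac_bits / 2**23) * 2

print(stored)  # 3.1999998092651367
```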
On the other hand, some apparently "strange" numbers can be stored exactly. For example, 5.13671875. Transformed into scientific notation: 1.2841796875 × 2².
Despite the seemingly large number of decimal places, this number is perfectly storable in a `double` or `float` variable, since 0.2841796875 is a sum of four negative powers of 2:
2⁻² + 2⁻⁵ + 2⁻⁹ + 2⁻¹⁰
In 23-bit binary format: `01001000110000000000000`
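A quick check in Python that the four powers really add up to that fraction and that the number (1.2841796875 × 2² = 5.13671875) survives a round trip through a 32-bit float without any error:

```python
import struct

# The fractional part is an exact, finite sum of negative powers of 2.
assert 2**-2 + 2**-5 + 2**-9 + 2**-10 == 0.2841796875

# So the full number round-trips through binary32 with no loss.
x = 5.13671875
roundtrip = struct.unpack('>f', struct.pack('>f', x))[0]
print(roundtrip == x)  # True
```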
If you want to test other numbers, you can use interactive online material that I make available to my students: IEEE 754 Floating Point.
And this link has four tutorials teaching how to do this process manually.
There, I use the 32-bit format, but you already have an idea of how the 64-bit format would work.