Complementing Marcos Banik's answer.
A double-precision (64-bit) floating-point number can roughly be summarized in 3 parts: sign (1 bit), exponent, i.e. order of magnitude (11 bits), and mantissa, i.e. precision (52 bits).
This can represent orders of magnitude up to about 10^308, but with a precision of only about 16 significant decimal digits (details on how base R handles numbers can be found in the ?.Machine help page). Moreover, irrational numbers, and rational numbers whose denominator is not a power of 2, are only approximated.
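
For instance, these limits are easy to see in base R itself:

.Machine$double.xmax      # largest representable double, about 1.797693e+308
.Machine$double.eps       # machine epsilon, about 2.220446e-16 (~16 digits)
0.1 + 0.2 == 0.3          # FALSE: 0.1 has no exact binary representation
print(0.1, digits = 20)   # 0.10000000000000000555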
So a very large number can be represented by a double, but not with full accuracy. This can cause major problems in operations such as addition and subtraction. In the third example, the numbers computed inside the variancia formula are of the order of (10^12)^2 = 10^24, and we only have 52 bits for the significant digits (the rest are imprecise). When we subtract two such numbers, the "good digits" cancel out and only the "bad digits" remain, producing the absurd result; this effect is known as catastrophic cancellation.
One way to solve the problem is to use a more stable floating-point algorithm, such as those Marcos proposed (a sketch follows below). But supposing this is not possible, you can use arbitrary-precision numbers.
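
One classic stable rewrite (a sketch; not necessarily the exact formula Marcos used) is the two-pass algorithm, which subtracts the mean before squaring, so the intermediate values stay small and no cancellation occurs:

variancia2 <- function(x) {
  m <- mean(x)                      # first pass: the mean
  sum((x - m)^2) / (length(x) - 1)  # second pass: centered sum of squares
}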
In R, the Rmpfr package (Multiple Precision Floating-Point Reliable) provides arbitrary-precision numbers (at the cost of more memory and runtime, so depending on your computer and on the problem it is not always feasible).
So, if it were not possible to improve the calculation algorithm of the variancia formula in some way, we could use Rmpfr. We would need more than 30 digits of precision, which means more than log2(10^30) ≈ 99.66 bits. Rounding up to 128 bits:
library(Rmpfr)
x2.mpfr <- mpfr(x2, 128)  # convert x2 to 128-bit arbitrary-precision numbers
variancia(x2.mpfr)
# 1 'mpfr' number of precision 128 bits
# [1] 0.0831672741323709434253943823576340475867
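
If you prefer not to hard-code 128, the bit count can be derived from the number of decimal digits required (digits and precBits are just illustrative names):

digits <- 30
precBits <- ceiling(digits * log2(10))  # 100 bits for 30 decimal digits
variancia(mpfr(x2, precBits))           # same idea, with just enough precision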