How to compute a moving sum in R?

6

I have the vector 1:50 and I need to compute a moving sum (analogous to a moving average). For example, with a window of 5 observations, the new vector would be c(sum(1:5), sum(2:6), sum(3:7), ..., sum(45:49), sum(46:50)). The closest I got to a solution without using a for loop was the aggregate function, as in aggregate(presidents, nfrequency = 1, FUN = weighted.mean, w = c(1, 1, 0.5, 1)).

    
asked by anonymous 17.12.2018 / 15:51

2 answers

7

I know two good packages for this: zoo (as Rui mentioned in a comment) and RcppRoll.

> zoo::rollsum(1:20, k = 5)
 [1] 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
> RcppRoll::roll_sum(1:20, n = 5)
 [1] 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
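For readers without either package installed, the same output can be reproduced in base R. This is only a minimal sketch for checking what the rolling functions return; `naive_roll_sum` is a hypothetical helper name, not part of zoo or RcppRoll, and it makes no attempt to be fast.

```r
# Hypothetical helper (not from zoo or RcppRoll):
# sums every window of k consecutive values of x
naive_roll_sum <- function(x, k) {
  stopifnot(k >= 1, k <= length(x))
  starts <- seq_len(length(x) - k + 1)  # first index of each window
  vapply(starts, function(i) sum(x[i:(i + k - 1)]), numeric(1))
}

res <- naive_roll_sum(1:20, k = 5)
res
#  [1] 15 20 25 30 35 40 45 50 55 60 65 70 75 80 85 90
length(res)  # length(x) - k + 1 windows
# [1] 16
```

The result has length(x) - k + 1 elements because only complete windows are summed.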

In terms of performance, RcppRoll is much faster (roughly 19 times by median time in the benchmark below):

> bench::mark(
+   zoo::rollsum(1:50, k = 5),
+   RcppRoll::roll_sum(1:50, n = 5)
+ )
# A tibble: 2 x 14
  expression     min     mean  median    max 'itr/sec' mem_alloc  n_gc n_itr total_time result memory time  gc   
  <chr>      <bch:t> <bch:tm> <bch:t> <bch:>     <dbl> <bch:byt> <dbl> <int>   <bch:tm> <list> <list> <lis> <lis>
1 zoo::roll… 909.4µs   3.45ms  1.71ms 40.3ms      290.   18.91KB     0   155      535ms <int … <Rpro… <bch… <tib…
2 RcppRoll:…  40.5µs 150.75µs 89.49µs 14.6ms     6634.    3.34KB     0  3316      500ms <dbl … <Rpro… <bch… <tib…
    
17.12.2018 / 16:49
5

There are a few ways to calculate the moving sum in R:

Base R

diff(c(0, cumsum(1:10)), 5)
# 15 20 25 30 35 40

This proposal can be generalized as a function:

soma_movel <- function(x, n) {
  diff(c(0, cumsum(x)), n)
}
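Why this works can be seen by writing out the prefix sums by hand. The sketch below is only an illustration of the identity behind the one-liner; the variable names `S` and `by_hand` are introduced here for the example.

```r
x <- 1:10
n <- 5

# S[j + 1] holds the sum of the first j elements of x; S[1] is the empty sum
S <- c(0, cumsum(x))

# The sum of x[3:7] is a difference of two prefix sums
sum(x[3:7]) == S[8] - S[3]

# diff() with lag = n computes S[i + n] - S[i] for every valid i at once
by_hand <- S[(n + 1):length(S)] - S[1:(length(S) - n)]
identical(by_hand, diff(S, lag = n))
```

Each window sum therefore costs a single subtraction, regardless of n. For non-integer input, the result may differ from summing each window directly by floating-point rounding, so a comparison with all.equal is safer than one with identical.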

Zoo

The zoo package, as mentioned in the comments, has a function for this, but it does not perform very well:

zoo::rollsum(1:10, 5)
# 15 20 25 30 35 40

Comparison

set.seed(123)
vetor <- rnorm(1e5) # 100 thousand numbers

# Do the functions return equal values?
all.equal(zoo::rollsum(vetor, 5), soma_movel(vetor, 5))
# [1] TRUE

Finally, a performance comparison of the proposed solutions shows that, by median time, the base solution is about 80 times faster than zoo but still about 5 times slower than the RcppRoll solution presented by Daniel.

microbenchmark::microbenchmark(
  zoo = zoo::rollsum(vetor, 5),
  base = soma_movel(vetor, 5), 
  cpp = RcppRoll::roll_sum(vetor, n = 5),
  times = 30
)
Unit: microseconds
 expr        min         lq       mean     median         uq        max neval cld
  zoo 200659.545 204218.475 208418.887 206276.601 209928.673 255552.267    30   b
 base   2229.273   2536.157   3379.694   2633.918   2755.286   7725.985    30  a 
  cpp    452.116    514.725   6966.097    558.089    577.333 188068.577    30  a 
    
17.12.2018 / 17:03