R has many *apply functions that are well explained in the help (e.g. ?apply). Because there are so many, novice users may have difficulty deciding which one is appropriate for their situation, or even remembering them all.
-
apply - When you want to apply a function to the rows or columns of a matrix (or across dimensions of a higher-dimensional array).
# Two-dimensional matrix
M <- matrix(seq(1,16), 4, 4)
# apply min to rows
apply(M, 1, min)
[1] 1 2 3 4
# apply max to columns
apply(M, 2, max)
[1]  4  8 12 16
# Three-dimensional array
M <- array(seq(32), dim = c(4,4,2))
# Apply sum across each M[*, , ], i.e. sum across the 2nd and 3rd dimensions
apply(M, 1, sum)
# Result is one-dimensional
[1] 120 128 136 144
# Apply sum across each M[*, *, ], i.e. sum across the 3rd dimension
apply(M, c(1,2), sum)
# Result is two-dimensional
     [,1] [,2] [,3] [,4]
[1,]   18   26   34   42
[2,]   20   28   36   44
[3,]   22   30   38   46
[4,]   24   32   40   48
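The examples above only use functions that return a single value per row or column. As a rough sketch of what happens otherwise (the small matrix and the anonymous function here are made up for illustration): when the function returns several values, apply stacks them as columns, so the result can look transposed relative to the input.

```r
# A 3 x 2 matrix: rows are (1, 4), (2, 5), (3, 6)
M <- matrix(1:6, nrow = 3)

# range() returns two values per row, so apply() gives a 2 x 3 matrix:
# column i holds the min and max of row i of M.
r <- apply(M, 1, range)
dim(r)

# Transpose to get one row of output per row of input.
t(r)

# An anonymous function works too: count the entries greater than 2
# in each column.
big <- apply(M, 2, function(col) sum(col > 2))
big
```

If you want one output row per input row, wrapping the call in t() as shown is usually enough.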
-
lapply - When you want to apply a function to each element of a list in turn and get a list back.
This is the workhorse behind many of the other *apply functions.
x <- list(a = 1, b = 1:3, c = 10:100)
lapply(x, FUN = length)
$a
[1] 1
$b
[1] 3
$c
[1] 91
lapply(x, FUN = sum)
$a
[1] 1
$b
[1] 6
$c
[1] 5005
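Two things that are easy to miss with lapply: extra arguments are passed straight through to the function, and a data frame counts as a list of columns. A minimal sketch, using made-up data (the list x2 and the data frame df are illustrative, not from the examples above):

```r
x2 <- list(a = 1, b = 1:3, c = c(2, NA, 4))

# Arguments after the function are forwarded to it on every call,
# so this computes mean(element, na.rm = TRUE) for each element.
means <- lapply(x2, mean, na.rm = TRUE)

# An anonymous function is fine when no existing function fits.
doubled <- lapply(x2, function(v) v * 2)

# A data frame is a list of columns, so lapply() visits each column.
df <- data.frame(u = 1:3, v = c(2.5, 3.5, 4.5))
cls <- lapply(df, class)

str(means)
str(cls)
```

In every case the result is a list with the same names as the input.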
-
sapply - When you want to apply a function to each element of a list in turn, but you want a vector back rather than a list.
If you find yourself typing unlist(lapply(...)), consider using sapply instead.
x <- list(a = 1, b = 1:3, c = 10:100)
# Compare with above; a named vector, not a list
sapply(x, FUN = length)
 a  b  c
 1  3 91
sapply(x, FUN = sum)
   a    b    c
   1    6 5005
In more advanced uses of sapply, it will try to coerce the result to a multi-dimensional array, if appropriate. For example, if our function returns vectors of the same length, sapply will use them as columns of a matrix:
sapply(1:5, function(x) rnorm(3, x))
If our function returns a 2-dimensional matrix, sapply will do essentially the same thing, treating each returned matrix as a single long vector:
sapply(1:5, function(x) matrix(x, 2, 2))
Unless we specify simplify = "array", in which case it will use the individual matrices to build a multi-dimensional array:
sapply(1:5, function(x) matrix(x, 2, 2), simplify = "array")
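To see the three simplification cases side by side, it helps to check dim() on each result. The sketch below swaps rnorm for a deterministic function (an assumption made only so the shapes are easy to verify) and adds a fourth case where the returned lengths differ, in which sapply cannot simplify and falls back to a list.

```r
# Equal-length vectors become the columns of a matrix: 3 rows, 5 columns.
m <- sapply(1:5, function(x) c(x, x^2, x^3))
dim(m)

# Each 2 x 2 matrix is flattened to a length-4 column: 4 rows, 5 columns.
flat <- sapply(1:5, function(x) matrix(x, 2, 2))
dim(flat)

# With simplify = "array" the matrices are kept intact: 2 x 2 x 5.
arr <- sapply(1:5, function(x) matrix(x, 2, 2), simplify = "array")
dim(arr)

# Results of different lengths cannot be simplified, so a list comes back.
ragged <- sapply(1:3, function(x) seq_len(x))
is.list(ragged)
```

That last case is the reason sapply can be awkward inside other functions: the type of the result depends on the data.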
-
vapply - When you want to use sapply but perhaps need to squeeze some more speed out of your code.
With vapply, you basically give R an example of what sort of value your function will return, which can save some time coercing the returned values to fit into a single atomic vector.
x <- list(a = 1, b = 1:3, c = 10:100)
# Note that since the advantage here is mainly speed, this
# example is only for illustration. We're telling R that
# everything returned by length() should be an integer of
# length 1.
vapply(x, FUN = length, FUN.VALUE = 0L)
 a  b  c
 1  3 91
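Besides speed, the template passed as FUN.VALUE also acts as a check on what the function returns. A small sketch of that behaviour (the tryCatch wrapper and the empty-list comparison are additions for illustration only):

```r
x <- list(a = 1, b = 1:3, c = 10:100)

# integer(1) is a template: "one integer per element".
ok <- vapply(x, length, FUN.VALUE = integer(1))

# range() returns two values, which does not match a length-1 template,
# so vapply() stops with an error instead of silently changing shape.
bad <- tryCatch(
  vapply(x, range, FUN.VALUE = numeric(1)),
  error = function(e) "error"
)

# On empty input, sapply() returns an empty list, while vapply()
# still returns a vector of the type given by the template.
empty_s <- sapply(list(), length)
empty_v <- vapply(list(), length, FUN.VALUE = integer(1))

ok
bad
```

This predictability is the main reason to prefer vapply over sapply inside functions and packages.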
-
mapply - For when you have several data structures (e.g. vectors, lists) and you want to apply a function to the 1st elements of each, then the 2nd elements of each, etc., coercing the result to a vector or array as in sapply.
In this case your function must accept multiple arguments.
#Sums the 1st elements, the 2nd elements, etc.
mapply(sum, 1:5, 1:5, 1:5)
[1] 3 6 9 12 15
#To do rep(1,4), rep(2,3), etc.
mapply(rep, 1:4, 4:1)
[[1]]
[1] 1 1 1 1
[[2]]
[1] 2 2 2
[[3]]
[1] 3 3
[[4]]
[1] 4
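Two mapply arguments tend to come up quickly in practice: MoreArgs, for values that should stay fixed across all calls, and SIMPLIFY = FALSE, to always get a list back. A minimal sketch with made-up functions (the names base, exp and scale are illustrative only):

```r
# Walks both vectors in parallel: 2^3, 3^2, 4^1.
pw <- mapply(function(base, exp) base^exp, 2:4, 3:1)

# SIMPLIFY = FALSE keeps the results as a list, as lapply() would.
pw_list <- mapply(function(base, exp) base^exp, 2:4, 3:1, SIMPLIFY = FALSE)

# MoreArgs holds arguments that are NOT iterated over; scale is passed
# unchanged to every call: (1+4)*10, (2+5)*10, (3+6)*10.
fixed <- mapply(function(x, y, scale) (x + y) * scale,
                1:3, 4:6,
                MoreArgs = list(scale = 10))

pw
fixed
```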
-
rapply - For when you want to apply a function to each element of a nested list structure, recursively.
#Append ! to a string, otherwise increment
myFun <- function(x){
    if (is.character(x)){
        return(paste(x,"!",sep=""))
    } else {
        return(x + 1)
    }
}
#A nested list structure
l <- list(a = list(a1 = "Boo", b1 = 2, c1 = "Eeek"),
          b = 3, c = "Yikes",
          d = list(a2 = 1, b2 = list(a3 = "Hey", b3 = 5)))
#Result is a named vector, coerced to character
rapply(l, myFun)
#Result is a nested list like l, with the values altered
rapply(l, myFun, how = "replace")
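rapply also takes a classes argument, which restricts the function to leaves of a given class and often removes the need for an is.character test inside the function. A hedged sketch on a smaller list (l2 is made up for illustration):

```r
l2 <- list(a = list(a1 = "Boo", b1 = 2), b = 3, c = "Yikes")

# Only numeric leaves are passed to the function; with how = "unlist"
# the non-matching leaves are dropped from the result.
nums <- rapply(l2, function(x) x * 10, classes = "numeric", how = "unlist")

# With how = "replace" the list keeps its shape and non-matching
# leaves are left untouched.
shout <- rapply(l2, toupper, classes = "character", how = "replace")

nums
str(shout)
```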
-
tapply - For when you want to apply a function to subsets of a vector, where the subsets are defined by some other vector, usually a factor.
A vector:
x <- 1:20
The factor (of the same size!) defining the groups:
y <- factor(rep(letters[1:5], each = 4))
Add up the values in x within each subgroup defined by y:
tapply(x, y, sum)
 a  b  c  d  e
10 26 42 58 74
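One way to think about tapply is as split followed by sapply: split the vector into groups, then apply the function to each group. The sketch below checks that equivalence on the same x and y (the variable names via_tapply and via_split are just for illustration).

```r
x <- 1:20
y <- factor(rep(letters[1:5], each = 4))

# Group sums computed directly.
via_tapply <- tapply(x, y, sum)

# The same thing in two explicit steps: split() makes a list with one
# element per level of y, and sapply() sums each element.
via_split <- sapply(split(x, y), sum)

# Any summary function works, for example the group means.
grp_means <- tapply(x, y, mean)

via_tapply
via_split
grp_means
```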
-
aggregate and by - It is relatively easy to summarize data by group in R using one or more BY (grouping) variables and a defined function.
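As a rough illustration of the two functions, here is a sketch on a small made-up data frame (df, group and value are hypothetical names). aggregate returns a data frame of per-group summaries, while by hands each group's rows to the function as a data frame.

```r
df <- data.frame(
  group = c("a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 5)
)

# Formula interface: summarize value within each level of group.
# The result is a data frame with one row per group.
agg <- aggregate(value ~ group, data = df, FUN = mean)

# by() passes each group's sub-data-frame to the function, so the
# function can use several columns at once if it needs to.
totals <- by(df, df$group, function(d) sum(d$value))

agg
totals
```

aggregate is convenient when you want the result as a data frame; by is more flexible when the per-group function needs the whole set of columns.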