To illustrate, I will create three example data.frames, one with the variable x, another with the variable y, and another with the variable z, all for the same individuals id:
### examples ####
set.seed(1)
df1 <- data.frame(id=1:10, y = rnorm(10))
df2 <- data.frame(id=1:10, z = rnorm(10))
df3 <- data.frame(id=1:10, x = rnorm(10))
With base R functions, one way to merge all three directly is to combine the Reduce function with merge:
resultado <- Reduce(function(x,y) {merge(x,y)}, list(df1, df2, df3))
resultado
id y z x
1 1 -0.6264538 1.51178117 0.91897737
2 2 0.1836433 0.38984324 0.78213630
3 3 -0.8356286 -0.62124058 0.07456498
4 4 1.5952808 -2.21469989 -1.98935170
5 5 0.3295078 1.12493092 0.61982575
6 6 -0.8204684 -0.04493361 -0.05612874
7 7 0.4874291 -0.01619026 -0.15579551
8 8 0.7383247 0.94383621 -1.47075238
9 9 0.5757814 0.82122120 -0.47815006
10 10 -0.3053884 0.59390132 0.41794156
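To make clear what Reduce is doing here, the following is a minimal sketch (the names via_reduce and by_hand are only illustrative) comparing the Reduce call with the nested merge calls it stands for:

```r
# Rebuild the example data (same seed and order as above).
set.seed(1)
df1 <- data.frame(id = 1:10, y = rnorm(10))
df2 <- data.frame(id = 1:10, z = rnorm(10))
df3 <- data.frame(id = 1:10, x = rnorm(10))

# Reduce folds the list from left to right,
# i.e. it computes merge(merge(df1, df2), df3).
via_reduce <- Reduce(function(a, b) merge(a, b), list(df1, df2, df3))

# The same thing written out by hand.
by_hand <- merge(merge(df1, df2), df3)

# Check that both approaches give the same data.frame.
identical(via_reduce, by_hand)
```

The advantage of Reduce is that the same line keeps working if the list grows to four or more data.frames.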
In the plyr package, there is the join_all function, which basically does the same thing as above, but with a simpler syntax:
library(plyr) ### if you do not have the package yet, you need to install it first
resultados <- join_all(list(df1,df2,df3))
resultados
id y z x
1 1 -0.6264538 1.51178117 0.91897737
2 2 0.1836433 0.38984324 0.78213630
3 3 -0.8356286 -0.62124058 0.07456498
4 4 1.5952808 -2.21469989 -1.98935170
5 5 0.3295078 1.12493092 0.61982575
6 6 -0.8204684 -0.04493361 -0.05612874
7 7 0.4874291 -0.01619026 -0.15579551
8 8 0.7383247 0.94383621 -1.47075238
9 9 0.5757814 0.82122120 -0.47815006
10 10 -0.3053884 0.59390132 0.41794156
Or, if you just want cleaner code, you can use the magrittr package, which provides a forward pipe operator for R. With this package, the merge of the 3 data.frames can also be done in a single line by chaining the calls with the %>% operator:
library(magrittr) ### if you do not have the package yet, you need to install it first
resultado <- df1 %>% merge(df2) %>% merge(df3)
resultado
id y z x
1 1 -0.6264538 1.51178117 0.91897737
2 2 0.1836433 0.38984324 0.78213630
3 3 -0.8356286 -0.62124058 0.07456498
4 4 1.5952808 -2.21469989 -1.98935170
5 5 0.3295078 1.12493092 0.61982575
6 6 -0.8204684 -0.04493361 -0.05612874
7 7 0.4874291 -0.01619026 -0.15579551
8 8 0.7383247 0.94383621 -1.47075238
9 9 0.5757814 0.82122120 -0.47815006
10 10 -0.3053884 0.59390132 0.41794156
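A pipe chain like this is just nested calls written from left to right. As a small sketch of that equivalence without any extra package, the version below uses base R's native pipe |> instead of magrittr's operator (an assumption: this needs R 4.1 or later; the names piped and nested are only illustrative):

```r
# Rebuild the example data (same seed and order as above).
set.seed(1)
df1 <- data.frame(id = 1:10, y = rnorm(10))
df2 <- data.frame(id = 1:10, z = rnorm(10))
df3 <- data.frame(id = 1:10, x = rnorm(10))

# Native pipe (assumes R >= 4.1): the left-hand side becomes
# the first argument of the call on the right-hand side.
piped <- df1 |> merge(df2) |> merge(df3)

# The nested form that the chain stands for.
nested <- merge(merge(df1, df2), df3)

# Check that both forms give the same data.frame.
identical(piped, nested)
```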
Remember that it is always good practice to specify the identifier columns for merge (the by argument of the function); otherwise you may end up with something different from what you expected. In the case above this is not necessary because the data.frames have only one column in common.
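To show why this matters, here is a minimal sketch with hypothetical data (the data.frames a and b and the column value are made up for illustration) in which two data.frames share more than one column name:

```r
# Hypothetical data: both data.frames share "id" AND a column named "value".
a <- data.frame(id = 1:3, value = c(10, 20, 30))
b <- data.frame(id = 1:3, value = c(10, 99, 30))

# Without `by`, merge joins on every common column (id and value),
# so the row whose value differs between a and b is silently dropped.
implicit <- merge(a, b)

# With by = "id", every id is kept and the clashing columns
# are disambiguated with the default suffixes .x and .y.
explicit <- merge(a, b, by = "id")

nrow(implicit)   # 2 rows: only ids 1 and 3 match on both columns
nrow(explicit)   # 3 rows: all ids are kept
names(explicit)  # "id" "value.x" "value.y"
```

The same argument can be passed inside the Reduce version, for example function(x, y) merge(x, y, by = "id").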