How to compare columns from one dataframe with those from others and remove columns that are not common between them [R]

1

Suppose I have 3 data frames with varying columns (e.g. x1, x2, ..., xn). Not all of these columns exist in every data frame. My goal is to compare the data frames and leave each of them with only the columns they all have in common.

Is it possible to do this with a single function?

    
asked by anonymous 23.08.2018 / 21:56

3 answers

4

The basic idea is shown below; you can wrap it in a single function to automate the process and handle more data frames.

df1 = data.frame(x1=runif(5,0,5), x2=runif(5,5,10), x3=runif(5,0,5), x4=runif(5,10,15))
df2 = data.frame(x1=runif(5,0,5), x2=runif(5,5,10), x4=runif(5,10,15))
df3 = data.frame(x2=runif(5,0,5), x3=runif(5,5,10), x4=runif(5,10,15))

idem_cols <- intersect(intersect(colnames(df1), colnames(df2)), colnames(df3))

> df1[idem_cols]
#        x2       x4
#1 6.393069 12.99105
#2 7.016564 12.57616
#3 9.451348 11.62159
#4 5.728012 11.23728
#5 8.795608 13.79248

> df2[idem_cols]
#        x2       x4
#1 9.489572 12.21699
#2 7.423554 11.57359
#3 5.058671 10.75123
#4 9.319093 10.00097
#5 5.620968 14.91703

> df3[idem_cols]
#         x2       x4
#1 2.5554488 13.83610
#2 4.4639556 10.05555
#3 4.1599600 14.10665
#4 0.4610773 10.21153
#5 2.9923365 14.80820
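
Since the question asks for a single function, the steps above could be wrapped roughly as follows. This is only a sketch: the name keep_common_cols is hypothetical (it does not come from the answer), and the small integer data frames stand in for the random ones above so the result is reproducible.

```r
# Hypothetical helper: the name keep_common_cols is an assumption.
# It takes any number of data frames and returns a list in which each
# data frame keeps only the columns present in all of them.
keep_common_cols <- function(...) {
  dfs <- list(...)
  # fold intersect() over the column names of every data frame
  common <- Reduce(intersect, lapply(dfs, colnames))
  # subset each data frame by the shared column names
  lapply(dfs, function(d) d[common])
}

# illustrative data with the same column layout as the answer
df1 <- data.frame(x1 = 1:3, x2 = 4:6, x3 = 7:9, x4 = 10:12)
df2 <- data.frame(x1 = 1:3, x2 = 4:6, x4 = 10:12)
df3 <- data.frame(x2 = 1:3, x3 = 4:6, x4 = 7:9)

trimmed <- keep_common_cols(df1, df2, df3)
lapply(trimmed, colnames)
```

Returning a list rather than overwriting the inputs leaves the original data frames untouched.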
    
23.08.2018 / 22:32
2

To complement @Fernandes's answer:

# create a list holding the common columns of each data frame
list <- list(df1[idem_cols], df2[idem_cols], df3[idem_cols])
list # print the list

> list
[[1]]
        x2       x4
1 7.796689 14.54941
2 9.473103 14.15803
3 7.818807 10.96527
4 6.381239 14.44439
5 9.552761 12.73286

[[2]]
        x2       x4
1 5.755445 11.08562
2 8.305431 11.57553
3 7.006299 12.62098
4 7.949986 13.11914
5 6.095582 10.30344

[[3]]
         x2       x4
1 0.6701076 14.23146
2 4.5605675 11.67825
3 0.8683714 11.08652
4 2.9171325 10.14618
5 3.8379593 14.99512

Afterwards, assign each data frame in the list to its own name with a for loop:

# overwrite df1, df2 and df3 with their trimmed versions from the list
for (i in seq_along(list)) {
    assign(paste0('df', i), value = data.frame(list[[i]]))
}

As a result, df1, df2 and df3 now hold only the common columns.

This is useful when you want to apply a function (such as tapply) to several data frames.
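
As one possible illustration of applying a function to several data frames, the trimmed data frames can also be kept in a named list and processed with lapply instead of being assigned back to individual names. The data and the choice of colMeans below are assumptions made only for this sketch.

```r
# illustrative data frames that already share the same columns
dfs <- list(
  df1 = data.frame(x2 = c(1, 2, 3), x4 = c(10, 20, 30)),
  df2 = data.frame(x2 = c(4, 5, 6), x4 = c(40, 50, 60))
)

# apply the same function to every data frame in the list;
# colMeans is just an example of a function that needs matching columns
col_means <- lapply(dfs, colMeans)

col_means$df1
```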

    
24.08.2018 / 05:07
0

Another option would be:

# df4 and df5 are assumed to exist alongside df1, df2 and df3
result <- Reduce(intersect, list(colnames(df1), colnames(df2), colnames(df3), colnames(df4), colnames(df5)))

This lets you compare as many data frames as you want.
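
To see what Reduce is doing here, it can help to look at the intermediate intersections. The sketch below uses plain character vectors that mimic the column names of the three data frames from the first answer (an assumption made for illustration) and asks Reduce to keep each step.

```r
# column names written out by hand, mirroring df1, df2 and df3
col_sets <- list(
  c("x1", "x2", "x3", "x4"),
  c("x1", "x2", "x4"),
  c("x2", "x3", "x4")
)

# accumulate = TRUE keeps every intermediate result of the fold
steps <- Reduce(intersect, col_sets, accumulate = TRUE)

# the last step is the set of names common to all inputs
result <- steps[[length(steps)]]
result
```

Each step intersects the running result with the next set of names, so adding more data frames only means adding more elements to the list.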

    
24.08.2018 / 16:41