Looping over a correlation matrix in R

5

I have been trying to learn about loops and functions in R, so I set myself the following exercise. I have a pairwise correlation matrix:

dados<-matrix(rnorm(100),5,5)
colnames(dados)<-c('A','B','C','D','E')
rownames(dados)<-c('A','B','C','D','E')
dados
cor<-cor(dados)

I want to use a loop and if conditions to keep only the pairs of variables whose correlation in the cor object is greater than 0.5. However, I cannot find a way to go through the rows and columns of my matrix.

I have tried the following code:

for (i in 1:nrow(cor)){
  for (j in 1:ncol(cor)){
    # command to compare pair by pair
    if (cor[i,j]>0.5){
      # return a new matrix with the variables and the value > 0.5
    }
  }
} 

Can anyone help me with these commands?

    
asked by anonymous 06.03.2016 / 16:59

2 answers

0

An easy way is to use the melt() function from the reshape2 package, as explained in this question.

Code:

install.packages("reshape2")             # In case you have not installed the package yet
library(reshape2)

set.seed(0101)
dados <- matrix(rnorm(100),5,5)
colnames(dados) <- c('A','B','C','D','E')
rownames(dados) <- c('A','B','C','D','E')
CorMatrix <- cor(dados)                    # Try to use variable names that are not
                                           # also function names

CM <- CorMatrix                            # Copy the matrix
CM[lower.tri(CM, diag = TRUE)] <- NA       # Remove the repeated correlations and the diagonal
resultado <- subset(                       # Keep the rows whose correlation is greater
    melt(CM, na.rm = TRUE),                # than the chosen threshold (0.5 here)
    value > 0.5)
rownames(resultado) <- NULL                # (optional) reset the row names

Result:

> resultado
  Var1 Var2     value
1    C    D 0.5215197
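
For readers who prefer not to install a package, the same upper-triangle filter as the melt() approach above can be sketched in base R. This is only an illustration under assumptions: the names keep, idx and pares are made up for the example, and rnorm(25) is used because a 5 x 5 matrix holds only 25 values.

```r
set.seed(101)
dados <- matrix(rnorm(25), 5, 5)
colnames(dados) <- c("A", "B", "C", "D", "E")
CorMatrix <- cor(dados)

# TRUE only above the diagonal AND where the correlation exceeds 0.5
keep <- upper.tri(CorMatrix) & CorMatrix > 0.5

# Row/column positions of the selected cells (two-column matrix)
idx <- which(keep, arr.ind = TRUE)

# One row per pair: the two variable names and their correlation
pares <- data.frame(
  Var1  = rownames(CorMatrix)[idx[, 1]],
  Var2  = colnames(CorMatrix)[idx[, 2]],
  value = CorMatrix[idx],
  stringsAsFactors = FALSE
)
pares
```

Because upper.tri() excludes the diagonal by default, each pair appears once and the trivial self-correlations of 1 are dropped.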

One tip for your code: do not use the name of a function as a variable name, as you did with the cor matrix.
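
To illustrate why this is a readability issue rather than an outright error, here is a small sketch (the name again is hypothetical). When a name is used in a call, R looks for a function and skips non-function objects, so the code keeps working, but the same name now means two different things.

```r
set.seed(1)
dados <- matrix(rnorm(25), 5, 5)

cor <- cor(dados)       # 'cor' now also names a matrix in the workspace
is.matrix(cor)          # the variable is a matrix

again <- cor(dados)     # still calls the function: calls skip non-function objects
identical(cor, again)   # same result as the first call
```

A distinct name such as CorMatrix avoids having to remember which cor is meant on each line.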

    
10.03.2016 / 23:06
1

Assuming you want to use loops (for practice or some other reason, since loops are not needed in this case), you can store the results in a list.

Recreating your data (with set.seed() for reproducibility):

set.seed(10)
dados <- matrix(rnorm(100),5,5)
colnames(dados) <- c('A','B','C','D','E')
rownames(dados) <- c('A','B','C','D','E')
cor <- cor(dados)

Traversing the loop and saving results in a list:

# list to store the results
resultados <- list()

for (i in 1:nrow(cor)){
  for (j in 1:ncol(cor)){
    if (cor[i,j]>0.5){
      # store the row at the first level and the column at the second level
      resultados[[rownames(cor)[i]]][[colnames(cor)[j]]] <- cor[i,j]
    }
  }
}

resultados
$A
        A         C 
1.0000000 0.7764006 

$B
       B        E 
1.000000 0.912793 

$C
        A         C 
0.7764006 1.0000000 

$D
D 
1 

$E
       B        E 
0.912793 1.000000

With the list in hand you can rearrange the data any way you want. For example, the simplest way to turn it into a vector is with unlist().

unlist(resultados)

      A.A       A.C       B.B       B.E       C.A       C.C       D.D       E.B       E.E 
1.0000000 0.7764006 1.0000000 0.9127930 0.7764006 1.0000000 1.0000000 0.9127930 1.0000000 

But remember that you do not have to use loops in this case. For example, one way to get the same result as above would be:

indices <- which(cor > 0.5, arr.ind = TRUE)
res <- setNames(cor[indices], paste(colnames(cor)[indices[,2]], rownames(cor)[indices[,1]], sep = "."))
res
      A.A       A.C       B.B       B.E       C.A       C.C       D.D       E.B       E.E 
1.0000000 0.7764006 1.0000000 0.9127930 0.7764006 1.0000000 1.0000000 0.9127930 1.0000000 
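
Both versions above include the diagonal (always 1) and each pair twice, since the matrix is symmetric. If that is not wanted, one possible variation of the loop visits only the cells above the diagonal. This is a sketch under assumptions: the names cor_matrix, n, pares and nome are made up for the example, and rnorm(25) is used because the matrix holds only 25 values.

```r
set.seed(10)
dados <- matrix(rnorm(25), 5, 5)
colnames(dados) <- c("A", "B", "C", "D", "E")
cor_matrix <- cor(dados)

n <- ncol(cor_matrix)
pares <- numeric(0)                 # named vector that grows as pairs are found

for (i in seq_len(n - 1)) {         # rows 1 .. n-1
  for (j in (i + 1):n) {            # only columns to the right of the diagonal
    if (cor_matrix[i, j] > 0.5) {
      nome <- paste(rownames(cor_matrix)[i], colnames(cor_matrix)[j], sep = ".")
      pares[nome] <- cor_matrix[i, j]
    }
  }
}

pares
```

Starting j at i + 1 is what skips both the diagonal and the mirrored lower triangle, so each pair of variables is tested exactly once.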
    
06.03.2016 / 23:52