I have some data in Excel and I want to generate a graphical correlation matrix from it in Python. I looked in the Seaborn documentation and found example code there, but to compute the correlation on my own data I put each column in its own list (the lists a, b, c, d and e in my code).
My code does not run, though. Is it possible to calculate the correlation directly from the Excel table, without having to put each column in a list?
Also, I did not understand the first line of the documentation code, from string import ascii_letters. When I copy the code straight from the documentation page it runs, but it does not generate any graphics.
In the end, my code looks like this.
# Importing libraries
from string import ascii_letters
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Build the lists a, b, c, d, e from the file exported from Excel
a = []
b = []
c = []
d = []
e = []
dataset = open("Teste_Plotagem.csv", "r")
for line in dataset:
    line = line.strip()
    A, B, C, D, E = line.split(";")
    a.append(A)
    b.append(B)
    c.append(C)
    d.append(D)
    e.append(E)
dataset.close()
# Trying to compute the correlation between the lists
sns.set(style="white")
corr = cor(a,b,c,d,e)
# Generate the mask for the upper triangle of the correlation matrix
# From here down I took the code from the documentation
mask = np.zeros_like(corr, dtype=bool)
mask[np.triu_indices_from(mask)] = True
# Set up the matplotlib figure
f, ax = plt.subplots(figsize=(11, 9))
# Generate the colormap with seaborn
cmap = sns.diverging_palette(220, 10, as_cmap=True)
# Draw the heatmap with the mask
sns.heatmap(corr, mask=mask, cmap=cmap, vmax=.3, center=0,
            square=True, linewidths=.5, cbar_kws={"shrink": .5})
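For reference, here is a minimal sketch of computing the correlation straight from the table with pandas instead of building one list per column. It rests on several assumptions: the file has five numeric columns separated by ";" and no header row, the column names A to E are only illustrative, and the inline sample data is made up so that the snippet runs without Teste_Plotagem.csv. If the file uses a decimal comma, read_csv would probably also need decimal=",". In the documentation example, ascii_letters seems to be used only to name the columns of randomly generated data, so it should not be needed here.

```python
import io

import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical sample standing in for Teste_Plotagem.csv: five
# semicolon-separated numeric columns and no header row (an assumption).
sample = io.StringIO(
    "1;2;5;1;3\n"
    "2;4;4;3;1\n"
    "3;6;3;2;4\n"
    "4;8;2;5;2\n"
    "5;10;1;4;5\n"
)

# With the real file this would be something like:
# df = pd.read_csv("Teste_Plotagem.csv", sep=";", header=None, names=list("ABCDE"))
df = pd.read_csv(sample, sep=";", header=None, names=list("ABCDE"))

# Pairwise Pearson correlation between all columns; no manual lists needed.
corr = df.corr()

# Mask the upper triangle (diagonal included) so each pair is drawn once.
mask = np.triu(np.ones_like(corr, dtype=bool))

sns.set(style="white")
f, ax = plt.subplots(figsize=(11, 9))
cmap = sns.diverging_palette(220, 10, as_cmap=True)
sns.heatmap(corr, mask=mask, cmap=cmap, vmin=-1, vmax=1, center=0,
            square=True, linewidths=.5, cbar_kws={"shrink": .5}, ax=ax)

# Outside a notebook, the figure only appears once plt.show() is called.
plt.show()
```

The missing plt.show() call is a likely reason the documentation code runs without displaying anything when executed as a plain script. The sketch also widens the color range to vmin=-1, vmax=1 so that strong correlations are not clipped; the original vmax=.3 can be kept if preferred.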