My dataset: pizzas.data
15.0,24.50
20.0,31.50
25.0,45.50
35.0,61.25
45.0,63.00
20.0,38.50
22.5,29.75
27.5,52.50
40.0,63.00
30.0,38.50
Code:
# -*- coding: utf-8 -*-
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# training split 50%: to match the slides
# Load the data
pizzas = pd.read_csv('pizzas.data', sep=',', header=None)
# sep defaults to ','
# used header=None because the table has no column labels
print(f'header: \n{pizzas.head()}')
print()
print(f'describe: \n{pizzas.describe()}')  # gives some summary statistics
print()
print('INFO:')
pizzas.info()  # info() prints its report itself and returns None
print(pizzas)
print()
#print(pizzas.ix[:,0:0])  # This will be X: the pizza diameter
print(pizzas.ix[:,1:1])  # This will be Y: the pizza price
# Exploratory data analysis
# Is there a correlation between the pizza's diameter and its price?
sns.jointplot(x=pizzas.ix[:,0:0],y =pizzas.ix[:,1:1] ,data=pizzas)
pizzas.ix[:,0:0]
is meant to select the diameter of the pizza (the first column of my dataset), which should be the X axis of the plot.
pizzas.ix[:,1:1]
is meant to select the second column of the dataset (the price), which should be the Y axis of the plot.
What's wrong?
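
One likely cause, offered as a sketch and not a confirmed diagnosis: `.ix` was deprecated in pandas 0.20 and removed in 1.0, and a column slice such as `0:0` yields a DataFrame, not a one-dimensional Series, whereas `sns.jointplot` expects column names (together with `data=`) or one-dimensional data. The example below embeds the data from the question in a string so it runs on its own; the column names `diameter` and `price` and the helper `plot_diameter_vs_price` are assumptions introduced for illustration.

```python
import io

import pandas as pd

# The ten rows from pizzas.data, embedded so the sketch is self-contained.
RAW = """15.0,24.50
20.0,31.50
25.0,45.50
35.0,61.25
45.0,63.00
20.0,38.50
22.5,29.75
27.5,52.50
40.0,63.00
30.0,38.50
"""

# header=None because the file has no header row; names= assigns
# illustrative column labels (an assumption, not part of the original file).
pizzas = pd.read_csv(io.StringIO(RAW), header=None, names=["diameter", "price"])

# A single integer position with .iloc returns a one-dimensional Series.
diameter = pizzas.iloc[:, 0]
price = pizzas.iloc[:, 1]

# A slice always returns a DataFrame; with .iloc the end is exclusive,
# so 0:1 keeps one column and 0:0 keeps none.
one_column_frame = pizzas.iloc[:, 0:1]
empty_frame = pizzas.iloc[:, 0:0]

# Pearson correlation between diameter and price.
correlation = diameter.corr(price)


def plot_diameter_vs_price(df):
    """Draw the joint plot by passing column names together with data=."""
    import seaborn as sns  # imported lazily so the rest runs without seaborn

    return sns.jointplot(x="diameter", y="price", data=df)
```

Calling `plot_diameter_vs_price(pizzas)` followed by `plt.show()` should then display the scatter plot with its marginal histograms. Passing the two Series directly, as in `sns.jointplot(x=diameter, y=price)`, is an alternative when the columns have no names.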