I need to split the data into groups and run calculations across two or three grouping dimensions.
I found the tapply function, which solves the problem: with it I get what I need using functions such as mean, sum, etc.
But now I realize that I need to homogenize the data within the selected groups, so instead of mean, sum, etc., I need to write a function that homogenizes the data and pass it to tapply. I think my homogenization function is faulty, but I cannot figure out what is wrong.
I have tried dplyr and data.table, following the idea in the linked question below, but everything gives an error: How to consolidate (aggregate or group) the values in a database?
Below is the code I have:
bairro <- c("B_FLORESTA", "B_PINHEIRAO", "B_PINHEIRAO", "B_PINHEIRINHO",
"B_LUTHER KING", "B_LUTHER KING", "B_VILA NOVA", "B_VILA NOVA",
"B_NOVA PETROPOLIS", "B_VILA NOVA", "B_INTERIOR", "B_ALVORADA",
"B_SADIA", "B_SADIA", "B_SADIA", "B_SADIA", "B_SADIA", "B_SADIA",
"B_SADIA", "B_JUPTER", "B_JUPTER", "B_FLORESTA", "B_ITALIA",
"B_ITALIA", "B_ITALIA", "B_ITALIA")
tipo <- c("CASA", "CASA", "COMERCIAIS", "CASA", "CASA", "COMERCIAIS",
"APARTAMENTO", "APARTAMENTO", "APARTAMENTO", "APARTAMENTO",
"SITIO", "APARTAMENTO", "CASA", "CASA", "CASA", "CASA",
"TERRENO", "TERRENO", "CASA", "CASA", "CASA", "CASA",
"CASA", "CASA", "CASA", "CASA")
valor <- c(1167, 2500, 1125, 2286, 400, 400, 1500, 1500, 300, 1500, 555,
973, 2500, 2556, 2500, 2556, 600, 850, 2338, 1857, 1857, 2000,
2000, 2063, 2000, 2063)
data <- c("2015_07", "2015_07", "2015_07", "2015_07", "2015_07", "2015_07",
"2015_07", "2015_07", "2015_08", "2015_08", "2015_08", "2015_08",
"2015_08", "2015_08", "2015_08", "2015_08", "2015_08", "2015_08",
"2015_09", "2015_09", "2015_09", "2015_09", "2015_09", "2015_09",
"2015_09", "2015_09")
dados <- data.frame(bairro, tipo, valor, data)
x <- tapply(dados$valor, list(dados$tipo, dados$data, dados$bairro), median)
## OK, this is final result 1.
So far so good, but now I need to homogenize the data, and this is where my problem is. Here is one of the functions I wrote for this:
homo <- function(a) {
  a <- a[order(a$valor), ] # sort by valor
  n <- nrow(a)
  a
  for (i in 1:n) {
    a$sobra[i] = round(((a$valor[i + 1] / a$valor[i]) * 100) - 100, digits = 2)
  }
  a <- subset(a, a$sobra < 50) # cutoff: keep rows with sobra < 50
  return(a)
}
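To make the intended calculation concrete, here is a minimal sketch of the same idea on a plain numeric vector rather than a data frame. The name homo_vec and the cutoff argument are hypothetical, and I am assuming that the largest value, which has no successor to compare against, should always be kept; that choice is not settled anywhere above.

```r
# Hypothetical vector-based variant of homo(): takes a numeric vector
# instead of a data frame with a valor column.
homo_vec <- function(v, cutoff = 50) {
  v <- sort(v)                  # ascending order
  n <- length(v)
  if (n < 2) return(v)          # nothing to compare against
  # percentage increase from each value to the next larger one
  sobra <- round((v[-1] / v[-n]) * 100 - 100, digits = 2)
  # assumption: the last value has no successor, so it is always kept
  keep <- c(sobra < cutoff, TRUE)
  v[keep]
}

# Toy input: sorted it is 400, 700, 850, 2500, 2556, so sobra is
# 75, 21.43, 194.12, 2.24 and the values 400 and 850 are dropped.
result <- homo_vec(c(400, 2500, 2556, 700, 850))
print(result)
```

With this input the sketch keeps 700, 2500 and 2556, which is the kind of filtering I want inside each group.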
When I pass the homo function to tapply, it gives an error:
tapply(dados$valor, list(dados$tipo, dados$data, dados$bairro), homo)
Can anyone help me?