tapply function - arguments must have the same length

Hello, good evening!

I have a data frame with thousands of rows and 58 columns containing, for example, vendor, material, quantity of material, and total value of the material. Below is a small example with only the columns I need for now.

Fornecedor  Material  Qtde  Valor_Total
A           A            1          100
A           B            2          150
A           E            5           26
B           B            6           76
C           A            5          126
C           C            1           58
D           D           10          108
E           E            9           99
E           A            7           30
E           E            8           80
E           E            1           54
F           G            1            0

First, I created a column with the average (unit) value for each row:

dados$valor_medio <- round(dados$Valor_Total/dados$Qtde,2)

Now I need to calculate, per material, the mean and the median of dados$valor_medio, plus a second mean that excludes the outliers. However, when I apply the tapply function, the following error occurs:

dados<-tapply(dados$valor_medio, dados$Material, mean, na.rm = TRUE)

Error in tapply(dados$valor_medio, dados$Material, mean, na.rm = TRUE) :   arguments must have the same length

Could someone help me with this error and tell me how to calculate the mean of dados$valor_medio for each material with the outliers excluded?

PS: The Material column is of type chr (character).

    
asked by anonymous 25.07.2017 / 03:38

1 answer

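When this error comes up, it is usually because you should be using ave (see ?ave) rather than tapply. tapply returns one value per group, whereas ave returns a vector with the same length as its input, so the result fits directly into a new column of the data frame.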

dados$valor_medio <- round(dados$Valor_Total/dados$Qtde,2)

dados$media <- ave(dados$valor_medio, dados$Material, FUN = mean)
dados$mediana <- ave(dados$valor_medio, dados$Material, FUN = median)
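To make the difference concrete, here is a small sketch that rebuilds the sample table from the question and compares the two functions. The last part shows one way the "arguments must have the same length" message can arise, namely when the two vectors passed to tapply differ in length; the column name coluna_inexistente is made up for the illustration, and this is only a possible cause, not necessarily what happened in the original data.

```r
# Sample data copied from the question
dados <- data.frame(
  Fornecedor  = c("A", "A", "A", "B", "C", "C", "D", "E", "E", "E", "E", "F"),
  Material    = c("A", "B", "E", "B", "A", "C", "D", "E", "A", "E", "E", "G"),
  Qtde        = c(1, 2, 5, 6, 5, 1, 10, 9, 7, 8, 1, 1),
  Valor_Total = c(100, 150, 26, 76, 126, 58, 108, 99, 30, 80, 54, 0),
  stringsAsFactors = FALSE
)
dados$valor_medio <- round(dados$Valor_Total / dados$Qtde, 2)

# tapply() collapses the data: one value per distinct material
por_material <- tapply(dados$valor_medio, dados$Material, mean, na.rm = TRUE)
length(por_material)   # 6

# ave() keeps the original shape: one value per row
por_linha <- ave(dados$valor_medio, dados$Material, FUN = mean)
length(por_linha)      # 12

# tapply() stops with an error when its inputs differ in length.
# Here `$` returns NULL (length 0) because the column does not exist
# (hypothetical name, for illustration only).
msg <- tryCatch(
  tapply(dados$coluna_inexistente, dados$Material, mean),
  error = function(e) conditionMessage(e)
)
msg
```

If a one-row-per-material summary is what is needed, the tapply result is the right shape; it just cannot be stored as a column of dados.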

As for the other mean, the one without outliers, it depends on how outliers are defined. One common definition is the one used by the boxplot.stats function, so I'll call that function to calculate the second mean.

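media_sem_out <- function(x){
    # stats[1] and stats[5] are the ends of the boxplot whiskers
    s <- boxplot.stats(x)$stats
    # keep only the values that lie within the whiskers
    x <- x[s[1] <= x & x <= s[5]]
    mean(x)
}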

dados$media2 <- ave(dados$valor_medio, dados$Material, FUN = media_sem_out)
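As a quick check of what this helper does, the sketch below applies it to a small made-up vector (the values are invented for illustration and are not from the question) in which one value sits far from the rest.

```r
# Same helper as above, repeated so the example runs on its own
media_sem_out <- function(x){
    s <- boxplot.stats(x)$stats
    x <- x[s[1] <= x & x <= s[5]]
    mean(x)
}

x <- c(10, 11, 12, 13, 100)   # made-up values; 100 is far from the others

boxplot.stats(x)$out   # 100 is flagged as an outlier
mean(x)                # 29.2, pulled up by the outlier
media_sem_out(x)       # 11.5, the mean of 10, 11, 12 and 13
```

For a material that appears in only one row, nothing is flagged, so this mean is the same as the plain mean.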
    
25.07.2017 / 16:42