Example: column 1 = 1 2 3 4 0 0 0. The ordinary mean of this column is 1.428571, but ignoring the zeros it would be 2.5. How can I compute the mean while ignoring the zero values in the column?
Assuming the dataset is called dados and has two columns named c1 and c2, with the following values:
dados <- data.frame(c1=c(1:4, rep(0, 3)), c2=7:1)
dados
c1 c2
1 1 7
2 2 6
3 3 5
4 4 4
5 0 3
6 0 2
7 0 1
do the following:
mean(dados[dados$c1!=0, 1])
The above code selects the rows of dados whose values in the first column are different from 0, and keeps only the first column of the data frame. With the correct rows and column selected, mean() just calculates the average of what is left.
Instead of referring to the first column by its position, 1, you can refer to it by name, as the command below does:
dados[dados$c1!=0, "c1"]
The result will be the same regardless of the method used.
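As a quick check of the subsetting described above, here is a minimal base R sketch. The variable name keep is only illustrative and is not part of the original answer; it stores the logical vector that the comparison produces, so the selection step can be inspected on its own.

```r
# Same example data as in the answer
dados <- data.frame(c1 = c(1:4, rep(0, 3)), c2 = 7:1)

# 'keep' is a hypothetical helper name: TRUE for rows 1-4, FALSE for rows 5-7
keep <- dados$c1 != 0

mean(dados$c1)           # 1.428571, zeros included
mean(dados[keep, "c1"])  # 2.5, zeros excluded
mean(dados$c1[keep])     # 2.5, subsetting the column vector directly
```

If the column may also contain NA values, the comparison yields NA for those rows, so mean() would probably need na.rm = TRUE in addition to the filter.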
An alternative to the answer proposed by Marcus Nunes is the example below, for which I took the liberty of reusing his dataset. The example uses the packages below, so make sure they are installed on your computer.
library(dplyr)
library(magrittr)
dados <- data.frame(c1=c(1:4, rep(0, 3)), c2=7:1)
dados
Assuming you want the average of column c1, you can use dplyr's filter function to keep only the rows where c1 is nonzero. The result of this operation is stored in an object, and the summarise function from the same package is then used to generate a column called MEDIA with the average of the filtered data. The solution is presented in two ways, with and without the pipe from the magrittr package.
Without the pipe from the magrittr package:
dt.filtro <- filter(dados, c1 != 0)
summarise(dt.filtro, MEDIA = mean(c1))
With magrittr's pipe, you can avoid creating intermediate objects like dt.filtro from the example above.
dados %>% filter(c1 != 0) %>% summarise(MEDIA = mean(c1))
If you want to see the averages of both columns after excluding the rows where c1 is zero, you only have to use a variant of summarise called summarise_each.
dados %>% filter(c1 != 0) %>% summarise_each(funs(mean))
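In recent dplyr releases, summarise_each and funs are deprecated. The sketch below assumes dplyr 1.0.0 or later and uses across() with everything() as a likely equivalent. The object name medias is only illustrative.

```r
library(dplyr)  # dplyr re-exports the %>% pipe

dados <- data.frame(c1 = c(1:4, rep(0, 3)), c2 = 7:1)

# Keep rows where c1 is nonzero, then average every remaining column
medias <- dados %>%
  filter(c1 != 0) %>%
  summarise(across(everything(), mean))

medias  # c1 = 2.5, c2 = 5.5
```

For a data frame with only numeric columns, base R's colMeans(dados[dados$c1 != 0, ]) should give the same two averages without any extra package.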