r - average of one variable based on the values of another variable within each group of a data frame

1

This question follows up on a recent post of mine. This is the link if you want to follow it: #

I have a multi-column data frame. How do I calculate the average of one variable based on the values of another variable, within the groups defined by one of the columns? Specifically, I have the frequency of several species recorded in 4 campaigns, divided into 2 stages, and I want the average frequency of each species at each site within each stage, computed over the campaigns of that stage. That is, the average must be based on all the campaigns carried out within that stage, not only on the campaigns in which the species was recorded, and not on ALL the campaigns regardless of stage. The script I am using, based on your help, is this:

# Sum all the records of each species at each site in each campaign.

dados_anura = dados_sapo %>%
  group_by(etapa, campanha,  local,  especie) %>%
  summarise(frequencia = sum(frequencia))
## Naming the summed column "frequencia" here means it does not have to be renamed by hand in the saved table
write.table(dados_anura, 'dados_anura.csv', sep = ';', row.names = F)


# Save it and read it back in here

dados_anuras <- read.csv("dados_anura.csv", header = TRUE, sep=";")

# Average based on all the campaigns, even when the species was not recorded.
# Compute the campaign averages grouped by species and site, using all the campaigns and not only those in which the species was recorded.
# Define a function mediaCamp that does these calculations. Then use aggregate once more.

mediaCamp <- function(x){
  ncamp <- length(unique(dados_anuras$campanha))
  sum(x)/ncamp
}

dadomean4 <- aggregate(frequencia ~ etapa + local + especie, dados_anuras, mediaCamp)
### To remove the NAs
dadomean4[is.na(dadomean4)] <- 0

But the result is wrong. Done this way, the average is calculated over ALL the campaigns rather than only over the campaigns of that stage, even though the value is reported in the cell for that stage.

etapa  campanha  local  especie  frequencia
A1     1         A      aa       1
A1     1         A      bb       2
A1     1         A      cc       1
A1     1         B      bb       1
A1     1         B      dd       7
A1     2         A      aa       50
A1     2         A      bb       1
A1     2         A      dd       8
A2     3         A      aa       2
A2     3         B      aa       3
A2     3         B      dd       3
A2     4         A      aa       33
A2     4         A      bb       5
A2     4         A      cc       1
A2     4         A      dd       1
A2     4         B      aa       18
A2     4         B      bb       10
A2     4         B      dd       6
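
A quick check on the sample above makes the mismatch concrete. This is only a minimal sketch: it assumes the table has been loaded as dados_anuras (the name used in the script), and ncamp_total and ncamp_etapa are illustrative names that are not part of the original code.

```r
# Sample data from the table above, entered by hand
dados_anuras <- read.table(header = TRUE, text = "
etapa campanha local especie frequencia
A1 1 A aa 1
A1 1 A bb 2
A1 1 A cc 1
A1 1 B bb 1
A1 1 B dd 7
A1 2 A aa 50
A1 2 A bb 1
A1 2 A dd 8
A2 3 A aa 2
A2 3 B aa 3
A2 3 B dd 3
A2 4 A aa 33
A2 4 A bb 5
A2 4 A cc 1
A2 4 A dd 1
A2 4 B aa 18
A2 4 B bb 10
A2 4 B dd 6
")

# Divisor used inside mediaCamp(): campaigns in the whole data set
ncamp_total <- length(unique(dados_anuras$campanha))
ncamp_total   # 4

# Divisor wanted: campaigns carried out within each stage
ncamp_etapa <- tapply(dados_anuras$campanha, dados_anuras$etapa,
                      function(x) length(unique(x)))
ncamp_etapa   # A1 = 2, A2 = 2
```

Because mediaCamp() looks up the campaign count on the full data frame, every group is divided by 4, whatever its stage.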
    
asked by anonymous 10.08.2018 / 21:03

1 answer

4

The average of each species recorded at each site within each stage:

library(dplyr)
# `data` holds the table from the question
group_by(data, especie, local, etapa) %>% summarise(Total = mean(frequencia))
# A tibble: 13 x 4
# Groups:   especie, local [?]
#    especie local etapa Total
#    <fct>   <fct> <fct> <dbl>
#  1 aa      A     A1     25.5
#  2 aa      A     A2     17.5
#  3 aa      B     A2     10.5
#  4 bb      A     A1      1.5
#  5 bb      A     A2      5
#  6 bb      B     A1      1
#  7 bb      B     A2     10
#  8 cc      A     A1      1
#  9 cc      A     A2      1
# 10 dd      A     A1      8
# 11 dd      A     A2      1
# 12 dd      B     A1      7
# 13 dd      B     A2      4.5
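
Note that mean(frequencia) divides by the number of rows present in each group, so a species recorded in only one of the two campaigns of a stage (for example bb at site B in A1) gets 1 rather than 0.5. If the divisor should instead be the number of campaigns carried out in the stage, as the question describes, one possible sketch is to count the campaigns per stage first and divide the summed frequency by that count. The names dados_anuras, n_camp and media below are illustrative, and the data frame is simply the sample from the question typed in by hand.

```r
library(dplyr)

# Sample data from the question, entered by hand
dados_anuras <- read.table(header = TRUE, text = "
etapa campanha local especie frequencia
A1 1 A aa 1
A1 1 A bb 2
A1 1 A cc 1
A1 1 B bb 1
A1 1 B dd 7
A1 2 A aa 50
A1 2 A bb 1
A1 2 A dd 8
A2 3 A aa 2
A2 3 B aa 3
A2 3 B dd 3
A2 4 A aa 33
A2 4 A bb 5
A2 4 A cc 1
A2 4 A dd 1
A2 4 B aa 18
A2 4 B bb 10
A2 4 B dd 6
")

medias <- dados_anuras %>%
  group_by(etapa) %>%
  # number of distinct campaigns carried out in each stage
  mutate(n_camp = n_distinct(campanha)) %>%
  group_by(especie, local, etapa) %>%
  # total frequency divided by the campaigns of the stage,
  # not by the number of rows in which the species appears
  summarise(media = sum(frequencia) / first(n_camp)) %>%
  ungroup()

medias
```

With this divisor, bb at site B in stage A1 comes out as 0.5 and dd at site A in stage A1 as 4. Combinations with no record at all in a stage (such as aa at site B in A1) still do not appear in the result; they would correspond to an average of 0.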
    
10.08.2018 / 23:50