Percentage Frequency in R with dplyr

1

I want to use the dplyr package to calculate the relative frequency per group. My data frame has the first three columns below, and the last column is the result I would like to obtain (each row's share of depositos within its Central):

CNPJ    Central                  depositos  Resultado final
315406  SICOOB CECRESP           4,61E+13   97,78%
512839  SICOOB CECRESP           1,05E+12   2,22%
68987   SICOOB CREDIMINAS        5,22E+13   33,00%
429890  SICOOB CREDIMINAS        3,88E+13   24,54%
803287  SICOOB CREDIMINAS        3,82E+13   24,15%
804046  SICOOB CREDIMINAS        2,90E+13   18,31%
694877  SICOOB PLANALTO CENTRAL  5,01E+13   100,00%
694389  SICOOB SC/RS             8,75E+13   67,28%
707903  SICOOB SC/RS             4,25E+13   32,72%

Any suggestions? I do not know the dplyr package well, but I made some unsuccessful attempts such as:

dados <- dados %>% 
  group_by(CENTRAL, depositos) %>%
  summarise(value = sum(value)) %>%
  mutate(csum = cumsum(value))

And how would I get the cumulative relative frequency by CENTRAL?

    
asked by anonymous 22.06.2017 / 23:19

2 answers

3

You can try this:

dados %>% 
    group_by(Central, depositos) %>% 
    mutate(freq_relat=Resultado/sum(Resultado)) %>%  
    mutate(freq_relat=round(freq_relat*100, 2))
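
A note on the choice of grouping columns, as a minimal sketch on toy data. The names toy, grupo and valor are hypothetical and are not the asker's data frame. It suggests that when the value column is itself part of group_by(), each distinct value becomes its own group, so the share is 1 in every row, while grouping by the category alone gives shares within each group:

```r
library(dplyr)

# Toy data with hypothetical names (grupo, valor); not the asker's data.
toy <- data.frame(
  grupo = c("A", "A", "B"),
  valor = c(30, 10, 5)
)

# Grouping by the value column as well: every row is its own group here,
# so valor / sum(valor) is 1 for each row.
por_linha <- toy %>%
  group_by(grupo, valor) %>%
  mutate(freq_relat = valor / sum(valor)) %>%
  ungroup()

# Grouping by the category only: each row's share within its group.
por_grupo <- toy %>%
  group_by(grupo) %>%
  mutate(freq_relat = valor / sum(valor)) %>%
  ungroup()
```

Under these assumptions, por_grupo holds 0.75 and 0.25 for group A and 1 for group B.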
    
23.06.2017 / 01:10

0

As feedback: adapting Rafael's code, this is what calculated the relative frequency:

dados <- dados %>% 
    group_by(CENTRAL) %>% 
    mutate(freq_relat = depositos / sum(depositos)) %>%  
    mutate(freq_relat = round(freq_relat * 100, 2))
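
For the cumulative relative frequency by CENTRAL asked about in the question, one possible approach is cumsum() inside the same grouped mutate(). The sketch below rebuilds the sample rows from the question as a plain data frame. It assumes depositos is numeric and uses the rounded values as displayed, so the percentages may differ from the Resultado final column in the second decimal place. The column name freq_acum and the descending sort within each group are illustrative choices, not something the thread specifies:

```r
library(dplyr)

# Sample rows copied from the question. depositos is typed as plain numerics;
# the question displays these values with decimal commas (e.g. 4,61E+13).
dados <- data.frame(
  CNPJ      = c(315406, 512839, 68987, 429890, 803287, 804046,
                694877, 694389, 707903),
  CENTRAL   = c("SICOOB CECRESP", "SICOOB CECRESP",
                rep("SICOOB CREDIMINAS", 4),
                "SICOOB PLANALTO CENTRAL",
                "SICOOB SC/RS", "SICOOB SC/RS"),
  depositos = c(4.61e13, 1.05e12, 5.22e13, 3.88e13, 3.82e13, 2.90e13,
                5.01e13, 8.75e13, 4.25e13),
  stringsAsFactors = FALSE
)

resultado <- dados %>%
  group_by(CENTRAL) %>%
  # Sort within each group so the running total goes from largest to smallest
  arrange(desc(depositos), .by_group = TRUE) %>%
  mutate(
    # Share of each row within its CENTRAL, in percent
    freq_relat = round(depositos / sum(depositos) * 100, 2),
    # Running share within its CENTRAL, in percent (freq_acum is a made-up name)
    freq_acum  = round(cumsum(depositos) / sum(depositos) * 100, 2)
  ) %>%
  ungroup()
```

The cumulative share is computed from the unrounded depositos rather than by summing the rounded freq_relat values, which avoids accumulating rounding error; the last row of each group should come out at 100.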
    
23.06.2017 / 14:37