Percentage Frequency in R with dplyr

Question

1

I want to use the dplyr package to calculate the relative frequency within each group. I have a data frame like the first three columns below, and I would like to compute the last column (Resultado final), which is each row's share of its group's total depositos:

CNPJ    Central                  depositos  Resultado final
315406  SICOOB CECRESP           4,61E+13   97,78%
512839  SICOOB CECRESP           1,05E+12   2,22%
68987   SICOOB CREDIMINAS        5,22E+13   33,00%
429890  SICOOB CREDIMINAS        3,88E+13   24,54%
803287  SICOOB CREDIMINAS        3,82E+13   24,15%
804046  SICOOB CREDIMINAS        2,90E+13   18,31%
694877  SICOOB PLANALTO CENTRAL  5,01E+13   100,00%
694389  SICOOB SC/RS             8,75E+13   67,28%
707903  SICOOB SC/RS             4,25E+13   32,72%

Any suggestions? I do not know much about the dplyr package, but I made some unsuccessful attempts, such as:

dados <- dados %>% 
  group_by(CENTRAL, depositos) %>%
  summarise(value = sum(value)) %>%
  mutate(csum = cumsum(value))

And how would I compute the cumulative relative frequency by CENTRAL?

r dplyr

asked by anonymous 22.06.2017 / 23:19

2 answers

0

Just to give feedback: based on Rafael's answer, the code that calculated the relative frequency was:

dados <- dados %>%
  group_by(CENTRAL) %>%
  mutate(freq_relat = depositos / sum(depositos)) %>%
  mutate(freq_relat = round(freq_relat * 100, 2))
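
The question also asks about the cumulative relative frequency by CENTRAL, which the code above does not cover. A minimal sketch of one way to do it follows, using a subset of the question's data with depositos typed as numbers. The column name freq_acum and the choice to accumulate from the largest deposit to the smallest are assumptions, not something stated in the thread.

```r
library(dplyr)

# Subset of the data from the question (decimal commas replaced by points)
dados <- data.frame(
  CENTRAL   = c("SICOOB CECRESP", "SICOOB CECRESP", "SICOOB SC/RS", "SICOOB SC/RS"),
  depositos = c(4.61e13, 1.05e12, 8.75e13, 4.25e13)
)

acumulado <- dados %>%
  # Assumption: within each CENTRAL, accumulate from largest to smallest deposit
  arrange(CENTRAL, desc(depositos)) %>%
  group_by(CENTRAL) %>%
  mutate(
    freq_relat = round(depositos / sum(depositos) * 100, 2),
    # Accumulate the raw deposits, then round, so rounding errors do not add up
    freq_acum  = round(cumsum(depositos) / sum(depositos) * 100, 2)
  ) %>%
  ungroup()

print(acumulado)
```

The last row of each group should reach 100 in freq_acum. If a different ordering matters (for example by CNPJ), only the arrange() call needs to change.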

23.06.2017 / 14:37


score 3 · Accepted Answer

You can try this:

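dados %>%
    # group only by Central; grouping by depositos too would leave one row per group
    group_by(Central) %>%
    # share of each row's depositos in its group's total
    mutate(freq_relat = depositos / sum(depositos)) %>%
    mutate(freq_relat = round(freq_relat * 100, 2))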
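
For anyone who wants to try this without the original file, here is a self-contained sketch that rebuilds the table from the question and applies the same grouping idea. It assumes depositos is stored as numeric (the question shows it with decimal commas, which would need converting first) and groups by Central only. Because the deposits in the question are displayed rounded to three significant digits, the percentages may differ slightly from the Resultado final column.

```r
library(dplyr)

# Data typed in from the question; depositos written with a decimal point
dados <- data.frame(
  CNPJ = c(315406, 512839, 68987, 429890, 803287, 804046,
           694877, 694389, 707903),
  Central = c(rep("SICOOB CECRESP", 2), rep("SICOOB CREDIMINAS", 4),
              "SICOOB PLANALTO CENTRAL", rep("SICOOB SC/RS", 2)),
  depositos = c(4.61e13, 1.05e12, 5.22e13, 3.88e13, 3.82e13, 2.90e13,
                5.01e13, 8.75e13, 4.25e13)
)

resultado <- dados %>%
  group_by(Central) %>%
  # Each row's share of its group's total deposits, as a percentage
  mutate(freq_relat = round(depositos / sum(depositos) * 100, 2)) %>%
  ungroup()

print(resultado)
```

A group with a single row, such as SICOOB PLANALTO CENTRAL, comes out as 100, and the percentages within each group add up to roughly 100 after rounding.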

23.06.2017 / 01:10