How to return the most predominant category associated with a group?

I have a data frame in which a is the grouping variable and b is a variable with some categories. My goal is, within each group of a, to return the category that appears most often in b.

Consider the dput output:

dataset=structure(list(a = c(500, 500, 500, 400, 400, 400, 300, 300, 
300), b = structure(c(1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L), .Label = c("a", 
"b"), class = "factor")), class = "data.frame", row.names = c(NA, 
-9L))

Desired result:

  a  b
500  a
400  b
300  a

In addition, it would be useful to return the count and percentage of the predominant category. Something like:

a    b    count    percent
500  a    2        .66 #66%
400  b    2        .66 #66% 
300  a    2        .66 #66% 
    
asked by anonymous 17.10.2018 / 17:44

1 answer

Using the dplyr package:

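library(dplyr)
dataset %>% 
  group_by(a, b) %>%                        # one group per (a, b) pair
  summarise(count = n()) %>%                # rows per pair; result stays grouped by a
  mutate(percent = count/sum(count)) %>%    # share of each category within its group of a
  filter(count == max(count))               # keep the top category per group (ties are all kept)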
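
If adding a dependency is not an option, roughly the same result can be sketched in base R with table(). This is only an illustration, not part of the original answer: the names tab, top and result are made up for the example, the data frame is rebuilt by hand from the dput in the question, and on a tie only the first category is kept (unlike the dplyr version).

```r
# Example data rebuilt from the dput in the question
dataset <- data.frame(
  a = c(500, 500, 500, 400, 400, 400, 300, 300, 300),
  b = factor(c("a", "b", "a", "b", "a", "b", "a", "b", "a"))
)

# Contingency table: one row per group of a, one column per category of b
tab <- table(dataset$a, dataset$b)

# Column index of the most frequent category in each row
# (which.max keeps the first one when there is a tie)
top <- apply(tab, 1, which.max)

# Pick the winning cell of each row via matrix indexing
top_count <- tab[cbind(seq_len(nrow(tab)), top)]

result <- data.frame(
  a = as.numeric(rownames(tab)),
  b = colnames(tab)[top],
  count = top_count,
  percent = top_count / rowSums(tab),
  row.names = NULL
)
result
```

Note that the rows come out sorted by a in ascending order, because table() sorts its levels.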
    
17.10.2018 / 18:20