Bar graph with relative and accumulated frequency

I'm trying to make a bar chart with the counts shown above the bars, the relative frequency on the left axis, and the cumulative frequency on the right axis, but I can't get it to work.

The data is:

dput(x2)
    c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L, 
    3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L, 
    3L, 6L, 4L, 5L, 4L, 6L, 7L)

Frequency distributions

table(x2)
x2
1 2 3 4 5 6 7 
4 8 8 5 8 3 2 

Relative frequencies

prop.table(table(x2))
x2
         1          2          3          4          5          6          7 
0.10526316 0.21052632 0.21052632 0.13157895 0.21052632 0.07894737 0.05263158 

What I'm trying to do is exactly as in the image below

    
asked by anonymous 27.04.2017 / 05:16

1 answer

This can be solved with the ggplot2 package. First, I build a data frame containing everything that needs to be plotted, with column names that are meaningful in this context:

dados <- c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L, 
           3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L, 
           3L, 6L, 4L, 5L, 4L, 6L, 7L)

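# counts, proportions, and cumulative proportions side by side
dados.plot <- data.frame(table(dados), table(dados)/sum(table(dados)),
  cumsum(prop.table(table(dados))))
# drop the 3rd column, which repeats the category labels
dados.plot <- dados.plot[, -3]
names(dados.plot) <- c("Categoria", "FreqAbsoluta", "FreqRelativa", 
  "FreqCumulativa")
# express the relative frequency as a percentage
dados.plot$FreqRelativa <- dados.plot$FreqRelativa*100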
dados.plot
  Categoria FreqAbsoluta FreqRelativa FreqCumulativa
1         1            4    10.526316      0.1052632
2         2            8    21.052632      0.3157895
3         3            8    21.052632      0.5263158
4         4            5    13.157895      0.6578947
5         5            8    21.052632      0.8684211
6         6            3     7.894737      0.9473684
7         7            2     5.263158      1.0000000   
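
The same table can also be built column by column, which avoids having to drop the duplicated category column afterwards. This is only a sketch of an alternative and not part of the original answer; the names tab and dados.plot2 are introduced here purely for illustration.

```r
dados <- c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L,
           3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L,
           3L, 6L, 4L, 5L, 4L, 6L, 7L)

tab <- table(dados)  # hypothetical helper: counts per category

# dados.plot2 is an illustrative name; the columns mirror dados.plot
dados.plot2 <- data.frame(
  Categoria      = factor(names(tab), levels = names(tab)),
  FreqAbsoluta   = as.vector(tab),                      # counts
  FreqRelativa   = as.vector(prop.table(tab)) * 100,    # percent
  FreqCumulativa = cumsum(as.vector(prop.table(tab)))   # proportion, 0 to 1
)
dados.plot2
```

As in the original table, FreqRelativa is a percentage while FreqCumulativa stays a proportion between 0 and 1.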

With the data frame dados.plot ready, I draw a bar chart from the FreqRelativa column and then place the FreqAbsoluta values above the bars. Finally, I use the sec_axis function to add a second axis. Note that I rescale FreqCumulativa so that the line ends at the height of the tallest bar: since FreqCumulativa is a proportion with a maximum of 1, multiplying it by the maximum of FreqRelativa is enough. The secondary axis then multiplies the left-axis values by 100 and divides them by the maximum of FreqRelativa, so that it reads as a cumulative percentage.

library(ggplot2)

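ggplot(dados.plot, aes(x=Categoria, y=FreqRelativa)) +
  # bars: relative frequency (%)
  geom_bar(stat="identity") + 
  # line: cumulative proportion rescaled to the height of the tallest bar
  geom_line(aes(y=FreqCumulativa*max(FreqRelativa), group=1)) +
  labs(x="Categoria", y="Frequência Relativa (%)") + 
  # counts above the bars
  geom_text(aes(label=FreqAbsoluta), vjust=-0.8) +
  scale_y_continuous(
    # right axis: converts left-axis heights to cumulative percent
    sec.axis=sec_axis(trans=~ .*100/(max(dados.plot$FreqRelativa)), 
    name = "Frequência Cumulativa (%)"))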
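
The rescaling used for the line and for the secondary axis can be checked numerically without drawing anything. The sketch below uses only base R; the names rel, cum, k, line_y and right_axis are hypothetical and chosen just for this check.

```r
dados <- c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L,
           3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L,
           3L, 6L, 4L, 5L, 4L, 6L, 7L)

rel <- as.vector(prop.table(table(dados))) * 100  # relative frequency, in %
cum <- cumsum(rel) / 100                          # cumulative proportion, 0 to 1

k <- max(rel)                   # height of the tallest bar on the left axis
line_y <- cum * k               # where the line is drawn, in left-axis units
right_axis <- line_y * 100 / k  # value the secondary axis shows at that height

# the line tops out at the tallest bar, and the right axis reads cum * 100
stopifnot(isTRUE(all.equal(max(line_y), k)))
stopifnot(isTRUE(all.equal(right_axis, cum * 100)))
round(right_axis, 1)
```

Because the same factor k is used in both directions, the right axis recovers the cumulative percentage regardless of how tall the tallest bar is.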

The colors, legends, and other features of the chart can be adjusted by consulting the ggplot2 documentation.
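
As one possible starting point for such adjustments, the sketch below changes the bar and line colors, adds points on the cumulative line, and switches the theme. The specific colors, the title text, and the names df, k and p are arbitrary choices made for illustration, not part of the original answer. The transformation formula is passed to sec_axis as its first positional argument.

```r
library(ggplot2)

dados <- c(1L, 5L, 3L, 3L, 5L, 3L, 4L, 1L, 2L, 2L, 7L, 3L, 2L, 2L, 3L,
           3L, 2L, 1L, 5L, 4L, 4L, 3L, 5L, 2L, 6L, 2L, 1L, 2L, 5L, 5L, 5L,
           3L, 6L, 4L, 5L, 4L, 6L, 7L)

# df reproduces the columns of dados.plot so the block runs on its own
prop <- as.vector(prop.table(table(dados)))
df <- data.frame(Categoria      = factor(sort(unique(dados))),
                 FreqAbsoluta   = as.vector(table(dados)),
                 FreqRelativa   = prop * 100,
                 FreqCumulativa = cumsum(prop))
k <- max(df$FreqRelativa)  # scaling factor shared by the line and the right axis

p <- ggplot(df, aes(x = Categoria, y = FreqRelativa)) +
  geom_col(fill = "steelblue") +                        # bar color (arbitrary)
  geom_line(aes(y = FreqCumulativa * k, group = 1),
            colour = "firebrick") +                     # line color (arbitrary)
  geom_point(aes(y = FreqCumulativa * k), colour = "firebrick") +
  geom_text(aes(label = FreqAbsoluta), vjust = -0.8) +  # counts above the bars
  scale_y_continuous(
    sec.axis = sec_axis(~ . * 100 / k, name = "Frequência Cumulativa (%)")) +
  labs(x = "Categoria", y = "Frequência Relativa (%)",
       title = "Relative and cumulative frequency") +   # illustrative title
  theme_minimal()
p
```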

    
27.04.2017 / 15:00