In R, using dplyr, create a new data frame

3

Suppose I have the following data frame:

 >data
   zona  candidato votos
    1     A         100
    1     B          20
    2     A          30
    2     B          15

Using dplyr, I want to produce the following data frame:

   >nova

   zona  votos_zona   votosA  votosB
     1      120         100       20
     2      45           30       15

I tried something like this

 nova <- data %>%
   group_by(zona) %>%
   summarise(votos_zona = sum(votos),
             votosA = ,
             votosB = )

But I cannot complete the code.

    
asked by anonymous 13.10.2014 / 23:11

2 answers

4

Here I think it is worth using a function from another of Hadley's packages, tidyr.

library(tidyr)
data %>% spread(candidato, votos)

  zona   A  B
1    1 100 20
2    2  30 15

Note that if there are many candidate names, you do not have to type them one by one: spread() creates one column per distinct value of candidato.

> data <- data.frame(zona = c(1,1,1,1), candidato = c("A", "B", "C", "D"), votos = c(100,20,30,15))
> data
  zona candidato votos
1    1         A   100
2    1         B    20
3    1         C    30
4    1         D    15
> data %>% spread(candidato, votos)
  zona   A  B  C  D
1    1 100 20 30 15
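
The output above does not include the votos_zona total that the question asks for. A minimal sketch of one way to add it, assuming a tidyr version that provides pivot_wider() (1.0.0 or later, where spread() is superseded) and using the question's example data; the "votos" column prefix is chosen here only to match the desired column names:

```r
library(dplyr)
library(tidyr)

# Example data from the question
data <- data.frame(
  zona      = c(1, 1, 2, 2),
  candidato = c("A", "B", "A", "B"),
  votos     = c(100, 20, 30, 15)
)

nova <- data %>%
  group_by(zona) %>%
  mutate(votos_zona = sum(votos)) %>%  # per-zone total, repeated on every row of the zone
  ungroup() %>%
  # one column per candidate; zona and votos_zona identify each output row
  pivot_wider(names_from = candidato,
              values_from = votos,
              names_prefix = "votos")

nova
```

Because the total is computed before reshaping, it does not depend on how many candidates there are.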
    
14.10.2014 / 16:54
1

You can put the condition inside sum() by subsetting votos:

data %>%
  group_by(zona) %>%
  summarise(votos_zona = sum(votos),
            votosA = sum(votos[candidato == "A"]),
            votosB = sum(votos[candidato == "B"]))
Source: local data frame [2 x 4]

  zona votos_zona votosA votosB
1    1        120    100     20
2    2         45     30     15
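
One property of this approach that may be worth knowing: if a candidate has no row in some zone, the subset is empty and sum() returns 0 rather than NA. A small sketch with hypothetical data (zone 2 has no row for candidate B, unlike the question's data):

```r
library(dplyr)

# Hypothetical data: candidate B is missing from zone 2
data <- data.frame(
  zona      = c(1, 1, 2),
  candidato = c("A", "B", "A"),
  votos     = c(100, 20, 30)
)

nova <- data %>%
  group_by(zona) %>%
  summarise(votos_zona = sum(votos),
            votosA = sum(votos[candidato == "A"]),
            votosB = sum(votos[candidato == "B"]))  # empty subset in zone 2, so the sum is 0

nova
```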
    
14.10.2014 / 00:46