How to create a variable that averages another variable from the same dataset?


Imagine I have the following data frame:

Country <- c("Brazil", "Brazil", "Brazil", "Brazil", "Brazil", "Brazil", "Argentina", "Argentina", "Argentina", "Argentina", "Argentina", "Argentina")
Year <- c(91, 92, 93, 94, 95, 96, 91, 92, 93, 94, 95, 96)
period <- c(1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3)
values <- c(5, 3, 4, 2, 1, 1, 5, 7, 4, 4, 3, 7)
df <- data.frame(country = Country, year = Year, period = period, pib = values)

   country year period pib
    Brazil   91      1   5
    Brazil   92      1   3
    Brazil   93      2   4
    Brazil   94      2   2
    Brazil   95      3   1
    Brazil   96      3   1
 Argentina   91      1   5
 Argentina   92      1   7
 Argentina   93      2   4
 Argentina   94      2   4
 Argentina   95      3   3
 Argentina   96      3   7

From this data frame I want to create a new variable called media that holds the average GDP (pib) for each country in each period, so that the final result would be:

   country year period pib media
   Brazil   91      1   5     4
   Brazil   92      1   3     4
   Brazil   93      2   4     3
   Brazil   94      2   2     3
   Brazil   95      3   1     1
   Brazil   96      3   1     1
Argentina   91      1   5     6
Argentina   92      1   7     6
Argentina   93      2   4     4
Argentina   94      2   4     4
Argentina   95      3   3     5
Argentina   96      3   7     5

I have no idea how to do this, but I believe there is a way. Can someone point me in the right direction?

PS: I tried to create the best possible example, but I'm still a beginner.

    
asked by anonymous 07.11.2016 / 18:50

1 answer


You can use dplyr as follows:

library(dplyr)
df %>%
  group_by(country, period) %>%
  mutate(media = mean(pib))

Source: local data frame [12 x 5]
Groups: country, period [6]

     country  year period   pib media
      <fctr> <dbl>  <dbl> <dbl> <dbl>
1     Brazil    91      1     5     4
2     Brazil    92      1     3     4
3     Brazil    93      2     4     3
4     Brazil    94      2     2     3
5     Brazil    95      3     1     1
6     Brazil    96      3     1     1
7  Argentina    91      1     5     6
8  Argentina    92      1     7     6
9  Argentina    93      2     4     4
10 Argentina    94      2     4     4
11 Argentina    95      3     3     5
12 Argentina    96      3     7     5
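If you want to cross-check the result without dplyr, base R's ave() should give the same column. This is only a sketch of an alternative, not part of the dplyr approach; it rebuilds the data frame from the question so that it runs on its own:

```r
# Rebuild the example data from the question
df <- data.frame(
  country = rep(c("Brazil", "Argentina"), each = 6),
  year    = rep(91:96, times = 2),
  period  = rep(c(1, 1, 2, 2, 3, 3), times = 2),
  pib     = c(5, 3, 4, 2, 1, 1, 5, 7, 4, 4, 3, 7)
)

# ave() applies FUN (mean by default) within each combination of the
# grouping variables and returns a vector aligned with the original rows
df$media <- ave(df$pib, df$country, df$period)

df$media
# 4 4 3 3 1 1 6 6 4 4 5 5
```
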

To create the variable and save it to the object df, use:

df <- df %>%
  group_by(country, period) %>%
  mutate(media = mean(pib)) %>%
  ungroup()

ungroup() is not required, but it is recommended so that subsequent operations are not performed per group.
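To see why this matters, here is a small sketch comparing a later mutate() on the grouped and on the ungrouped data. The column name overall is just a hypothetical name for this illustration, and the data frame is rebuilt from the question so the block runs on its own:

```r
library(dplyr)

# Rebuild the example data from the question
df <- data.frame(
  country = rep(c("Brazil", "Argentina"), each = 6),
  year    = rep(91:96, times = 2),
  period  = rep(c(1, 1, 2, 2, 3, 3), times = 2),
  pib     = c(5, 3, 4, 2, 1, 1, 5, 7, 4, 4, 3, 7)
)

grouped <- df %>%
  group_by(country, period) %>%
  mutate(media = mean(pib))

# Still grouped: mean(pib) is again computed per country/period,
# so `overall` just repeats `media`
still_grouped <- grouped %>%
  mutate(overall = mean(pib))

# Ungrouped: mean(pib) is computed over all 12 rows (46 / 12)
ungrouped <- grouped %>%
  ungroup() %>%
  mutate(overall = mean(pib))
```

In still_grouped the new column matches media row by row, while in ungrouped every row gets the same overall mean, which is usually what you expect from a later summary.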

    
07.11.2016 / 19:05