Repeating the subtraction of groups in a data frame for all numeric variables

6

I have the following code:

df <- data.frame(grp = rep(letters[1:3], each = 2),
                 index = rep(1:2, times = 3),
                 value = seq(10, 60, length.out = 6),
                 value2 = seq(20, 70, length.out = 6),
                 value3 = seq(30, 80, length.out = 6))

library(tidyverse)
as_tibble(df) # for nicer printing

# grp   index value value2 value3
# <fct> <int> <dbl>  <dbl>  <dbl>
# 1 a      1    10     20     30
# 2 a      2    20     30     40
# 3 b      1    30     40     50
# 4 b      2    40     50     60
# 5 c      1    50     60     70
# 6 c      2    60     70     80

# expected result:
# grp   index value value2 value3
# <fct> <int> <dbl>  <dbl>  <dbl>
# 1 a      1    10     20     30
# 2 a      2    20     30     40
# 3 b      1   -20    -20    -20
# 4 b      2   -20    -20    -20
# 5 c      1    50     60     70
# 6 c      2    60     70     80

# subtract one group from another
df$value[df$grp=="b"]  = df$value[df$grp=="b"]  - df$value[df$grp=="c"]
df$value2[df$grp=="b"] = df$value2[df$grp=="b"] - df$value2[df$grp=="c"]
df$value3[df$grp=="b"] = df$value3[df$grp=="b"] - df$value3[df$grp=="c"]

How do I subtract the value# columns of group 'c' from the value# columns of group 'b' all at once, that is, apply

df$value[df$grp=="b"]  = df$value[df$grp=="b"]  - df$value[df$grp=="c"]

to every variable without writing one line per column?

    
asked by anonymous 03.12.2018 / 15:55

2 answers

7

The following does what you want using only base R.

ib <- which(df$grp == "b")
ic <- which(df$grp == "c")
df[3:5] <- lapply(df[3:5], function(x){
  x[ib] <- x[ib] - x[ic]
  x
})

df
#  grp index value value2 value3
#1   a     1    10     20     30
#2   a     2    20     30     40
#3   b     1   -20    -20    -20
#4   b     2   -20    -20    -20
#5   c     1    50     60     70
#6   c     2    60     70     80

Now, tidy up. The variables ib and ic, which were used to index the vectors being transformed, are no longer needed.

rm(ib, ic)
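
If the column positions 3:5 are not known in advance, the same idea can be sketched by picking the columns by name instead. This is only a minimal sketch: the names dat, value_cols, is_b and is_c are made up for illustration, and it assumes that every column to transform starts with "value" and that the "b" and "c" rows are equally many and appear in the same index order.

```r
# Rebuild the example data from the question so the sketch is self-contained.
dat <- data.frame(grp = rep(letters[1:3], each = 2),
                  index = rep(1:2, times = 3),
                  value = seq(10, 60, length.out = 6),
                  value2 = seq(20, 70, length.out = 6),
                  value3 = seq(30, 80, length.out = 6))

# Select the columns by name pattern rather than by position
# (assumption: all columns to transform start with "value").
value_cols <- grep("^value", names(dat), value = TRUE)

# Logical row selectors (assumption: "b" and "c" rows line up one to one).
is_b <- dat$grp == "b"
is_c <- dat$grp == "c"

# Subtract the "c" block from the "b" block for all selected columns at once.
dat[is_b, value_cols] <- dat[is_b, value_cols] - dat[is_c, value_cols]

dat
```

Because the subtraction is done on two data frame blocks of the same shape, no loop or lapply is needed, at the cost of relying on the row order of the two groups.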
    
03.12.2018 / 16:12
2

You can do this with dplyr:

bind_rows(
  df %>% 
    filter(grp == "a"),

  df %>% 
    filter(grp != "a") %>% 
    group_by(index) %>% 
    # within each index, "b" gets b - c; "c" has no next row, so it gets c - 0
    mutate_at(vars(starts_with("value")), ~ . - lead(., order_by = grp, default = 0))
)

  grp index value value2 value3
1   a     1    10     20     30
2   a     2    20     30     40
3   b     1   -20    -20    -20
4   b     2   -20    -20    -20
5   c     1    50     60     70
6   c     2    60     70     80

The code is a little awkward because group "a" has to be left untouched. If in practice you always subtracted the next group's value from the current group's value, you could drop the bind_rows and the filters.
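
As an alternative that avoids splitting the data, here is a hedged sketch using across(), which assumes dplyr 1.0.0 or later. The names dat and result are made up for illustration, and the sketch assumes there is exactly one "c" row for each value of index.

```r
library(dplyr)

# Rebuild the example data from the question so the sketch is self-contained.
dat <- data.frame(grp = rep(letters[1:3], each = 2),
                  index = rep(1:2, times = 3),
                  value = seq(10, 60, length.out = 6),
                  value2 = seq(20, 70, length.out = 6),
                  value3 = seq(30, 80, length.out = 6))

result <- dat %>%
  group_by(index) %>%
  # For each value* column: rows of group "b" become b - c (same index),
  # all other rows are returned unchanged.
  mutate(across(starts_with("value"),
                ~ ifelse(grp == "b", .x - .x[grp == "c"], .x))) %>%
  ungroup()

result
```

Here the condition grp == "b" plays the role of the two filters, so group "a" never has to be separated from the rest.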

    
05.12.2018 / 12:49