R - How to create a lagged variable (lag) within each individual?


I need to lag a variable in my data frame ( dCoopCred ). However, the lag must not mix values from two different individuals ( CNPJ ). For each CNPJ, I would like LAG_Result_ant_desp to hold the value of Result_ant_desp at t-1 (the previous period).

Example:

structure(list(CNPJ = c(5834, 5834, 5834, 5834, 5834, 9797, 9797, 
9797, 9797, 9797), ano = c(2006, 2007, 2008, 2009, 2010, 2006, 
2007, 2008, 2009, 2010), PIB = c(4, 6, 5, 1, 7, 4, 6, 5, 1, 7
), Result_ant_desp = c(5000, 7000, 6000, 2000, 3500, 1500, 2600, 
3000, 2100, 3100), LAG_Result_ant_desp = structure(c(9L, 6L, 
8L, 7L, 2L, 9L, 1L, 4L, 5L, 3L), .Label = c("1500", "2000", "2100", 
"2600", "3000", "5000", "6000", "7000", "N/A"), class = "factor")), class = "data.frame", row.names = c(NA, 
-10L))

I was able to lag the variable by one period using the Hmisc package and the command

dCoopCred$LAG_result_ant_desp <- Lag(dCoopCred$result_ant_desp, +1)

However, this command on its own ends up mixing result_ant_desp values across different CNPJ and years.
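To illustrate the problem, here is a minimal sketch of how an ungrouped one-period lag leaks across individuals. It uses a base R shift as a stand-in for the ungrouped lag (an assumption for illustration, not the Hmisc call itself) and a four-row subset of the example data.

```r
# Toy subset: two individuals (CNPJ), two years each, values taken from the example
df <- data.frame(
  CNPJ = c(5834, 5834, 9797, 9797),
  ano = c(2009, 2010, 2006, 2007),
  Result_ant_desp = c(2000, 3500, 1500, 2600)
)

# Ungrouped lag: shift the whole column down by one row, ignoring CNPJ
df$lag_ungrouped <- c(NA, head(df$Result_ant_desp, -1))

df
# Row 3 (CNPJ 9797, year 2006) receives 3500,
# which is the 2010 value of CNPJ 5834; it should be NA.
```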

I also tried the following code

teste <- dCoopCred %>% 
  distinct(CNPJ, ano, .keep_all = TRUE) %>% 
  group_by(CNPJ) %>% 
  mutate(LAG_result_ant_desp = lead(result_ant_desp, n = 1L)) %>% 
  select(-result_ant_desp) %>% 
  ungroup() %>% 
  left_join(dCoopCred, ., by = c("ano", "CNPJ")) 

This does what I wanted, but it generates another data frame. I would like the variable to be created in dCoopCred itself.

    
asked by anonymous 15.06.2018 / 01:34

1 answer


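There is a simpler way of doing what the question asks. Instead of pipes %>% , use ave , which applies a function within each group. Note: with dplyr attached, the lag function that gets called is the one from the dplyr package, not stats::lag (which does not shift the values of a plain vector).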

library(dplyr)

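# ave() applies FUN within each CNPJ; dplyr::lag() shifts by one position by default.
# Any unnamed argument after x would be read by ave() as a grouping variable, so no extra argument is passed.
dCoopCred$LAG_Result_ant_desp <- with(dCoopCred, ave(Result_ant_desp, CNPJ, FUN = dplyr::lag))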

dCoopCred
#   CNPJ  ano PIB Result_ant_desp LAG_Result_ant_desp
#1  5834 2006   4            5000                  NA
#2  5834 2007   6            7000                5000
#3  5834 2008   5            6000                7000
#4  5834 2009   1            2000                6000
#5  5834 2010   7            3500                2000
#6  9797 2006   4            1500                  NA
#7  9797 2007   6            2600                1500
#8  9797 2008   5            3000                2600
#9  9797 2009   1            2100                3000
#10 9797 2010   7            3100                2100
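If a pipe-based version is preferred, the result can be assigned back to dCoopCred so that no second data frame is created. The following is a minimal sketch, assuming the rows should be ordered by ano within each CNPJ before lagging; the data frame is rebuilt here (without PIB) only to keep the example self-contained.

```r
library(dplyr)

# Example data rebuilt from the question (PIB omitted for brevity)
dCoopCred <- data.frame(
  CNPJ = rep(c(5834, 9797), each = 5),
  ano = rep(2006:2010, times = 2),
  Result_ant_desp = c(5000, 7000, 6000, 2000, 3500,
                      1500, 2600, 3000, 2100, 3100)
)

dCoopCred <- dCoopCred %>%
  arrange(CNPJ, ano) %>%          # make sure t-1 is the previous row
  group_by(CNPJ) %>%              # lag is computed separately per CNPJ
  mutate(LAG_Result_ant_desp = lag(Result_ant_desp, n = 1)) %>%
  ungroup() %>%
  as.data.frame()                 # back to a plain data.frame

dCoopCred
```

Note that lag gives the value at t-1, whereas lead (used in the question's code) gives the value at t+1.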

Data.
Since the data in the question already includes the new column, here it is with only the first four columns, in dput format.

dCoopCred <-
structure(list(CNPJ = c(5834, 5834, 5834, 5834, 5834, 9797, 9797, 
9797, 9797, 9797), ano = c(2006, 2007, 2008, 2009, 2010, 2006, 
2007, 2008, 2009, 2010), PIB = c(4, 6, 5, 1, 7, 4, 6, 5, 1, 7
), Result_ant_desp = c(5000, 7000, 6000, 2000, 3500, 1500, 2600, 
3000, 2100, 3100)), .Names = c("CNPJ", "ano", "PIB", "Result_ant_desp"
), row.names = c(NA, -10L), class = "data.frame")
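For completeness, the same idea can be sketched without dplyr, which avoids any ambiguity about which lag function is called. The helper lag1 below is a hypothetical name introduced for illustration, and the sort step assumes the previous period is the previous ano within each CNPJ.

```r
# Example data rebuilt from the question (PIB omitted for brevity)
dCoopCred <- data.frame(
  CNPJ = rep(c(5834, 9797), each = 5),
  ano = rep(2006:2010, times = 2),
  Result_ant_desp = c(5000, 7000, 6000, 2000, 3500,
                      1500, 2600, 3000, 2100, 3100)
)

# Hypothetical helper: shift a vector one position, padding the start with NA
lag1 <- function(x) c(NA, head(x, -1))

# Sort so that, within each CNPJ, the previous row is the previous year
dCoopCred <- dCoopCred[order(dCoopCred$CNPJ, dCoopCred$ano), ]

# ave() applies lag1 separately within each CNPJ group
dCoopCred$LAG_Result_ant_desp <- ave(dCoopCred$Result_ant_desp,
                                     dCoopCred$CNPJ,
                                     FUN = lag1)

dCoopCred
```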
    
15.06.2018 / 13:40