Lags and Leads in R

Question

Is it possible to create the equivalents of Stata's l(1/4) and f(1/4) operators (lags and leads 1 through 4) and add them to panel data in R? I cannot find a suitable R function.

My idea:

crisisdata <- pdata.frame(crisisdata, index = c("year"))
crisisdata <- crisisdata %>%
  mutate(l1_ltd= Lag(ltd, 1)) %>%
  mutate(l2_ltd= Lag(ltd, 2)) %>%
  mutate(l3_ltd= Lag(ltd, 3)) %>%
  mutate(l4_ltd= Lag(ltd, 4)) %>%
  mutate(f0_ltd= ltd) %>%
  mutate(f1_ltd= dplyr::lead(ltd, n = 1, default = NA)) %>%
  mutate(f2_ltd= dplyr::lead(ltd, n = 2, default = NA)) %>%
  mutate(f3_ltd= dplyr::lead(ltd, n = 3, default = NA)) %>%
  mutate(f4_ltd= dplyr::lead(ltd, n = 4, default = NA))

But it doesn't work. The result is right for all years for country A, but for all other countries I only get NA values.

Welcome to SO, Tanja! Questions on SO (especially in R) do much better if they are reproducible and self-contained. By that I mean including attempted code (please be explicit about non-base packages), sample representative data (perhaps via `dput(head(x))` or building data programmatically (e.g., `data.frame(...)`), possibly stochastically), perhaps actual output (with verbatim errors/warnings) versus intended output. Refs: https://stackoverflow.com/q/5963269, [mcve], and https://stackoverflow.com/tags/r/info. — r2evans, Jul 16 '22 at 20:40

langtang · Answer 1 · 2022-07-16T23:31:29.170

You can do this fairly easily with data.table::shift(), which allows multiple leads and lags.

# load library
library(data.table)

# convert crisisdata to a data.table (in place)
setDT(crisisdata)

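# add lags 1 to 4 of ltd, computed separately within each country
# (rows are assumed to be ordered by year within country)
crisisdata[, (paste0("l", 1:4, "_ltd")) := shift(ltd, 1:4), by = country]

# add leads 1 to 4 of ltd; a negative n in shift() produces a lead
crisisdata[, (paste0("f", 1:4, "_ltd")) := shift(ltd, -1:-4), by = country]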

Output:

      year country         ltd      l1_ltd     l2_ltd     l3_ltd     l4_ltd      f1_ltd      f2_ltd      f3_ltd     f4_ltd
     <int>  <char>       <num>       <num>      <num>      <num>      <num>       <num>       <num>       <num>      <num>
  1:  2000       A -0.56047565          NA         NA         NA         NA -0.23017749  1.55870831  0.07050839  0.1292877
  2:  2001       A -0.23017749 -0.56047565         NA         NA         NA  1.55870831  0.07050839  0.12928774  1.7150650
  3:  2002       A  1.55870831 -0.23017749 -0.5604756         NA         NA  0.07050839  0.12928774  1.71506499  0.4609162
  4:  2003       A  0.07050839  1.55870831 -0.2301775 -0.5604756         NA  0.12928774  1.71506499  0.46091621 -1.2650612
  5:  2004       A  0.12928774  0.07050839  1.5587083 -0.2301775 -0.5604756  1.71506499  0.46091621 -1.26506123 -0.6868529
 ---                                                                                                                      
120:  2026       D -1.02412879 -0.84970435 -0.6407060  0.1056762  0.3011534  0.11764660 -0.94747461 -0.49055744 -0.2560922
121:  2027       D  0.11764660 -1.02412879 -0.8497043 -0.6407060  0.1056762 -0.94747461 -0.49055744 -0.25609219         NA
122:  2028       D -0.94747461  0.11764660 -1.0241288 -0.8497043 -0.6407060 -0.49055744 -0.25609219          NA         NA
123:  2029       D -0.49055744 -0.94747461  0.1176466 -1.0241288 -0.8497043 -0.25609219          NA          NA         NA
124:  2030       D -0.25609219 -0.49055744 -0.9474746  0.1176466 -1.0241288          NA          NA          NA         NA

Input:

set.seed(123)
crisisdata = data.frame(
  year = rep(2000:2030, 4),
  country = rep(LETTERS[1:4], each=31),
  ltd = rnorm(124)
)
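
Since the question's pipeline uses dplyr but never groups by country, here is a rough dplyr sketch of the same idea for comparison. It is not part of the original answer; the name `crisisdata_dplyr` is made up for illustration, and it assumes one row per country and year with no gaps in the years.

```r
library(dplyr)

# same sample data as in the Input block above
set.seed(123)
crisisdata <- data.frame(
  year = rep(2000:2030, 4),
  country = rep(LETTERS[1:4], each = 31),
  ltd = rnorm(124)
)

# crisisdata_dplyr is a hypothetical name used only in this sketch
crisisdata_dplyr <- crisisdata %>%
  arrange(country, year) %>%   # lag/lead work on row order, so sort first
  group_by(country) %>%        # compute lags and leads within each country
  mutate(
    l1_ltd = dplyr::lag(ltd, n = 1),
    l2_ltd = dplyr::lag(ltd, n = 2),
    l3_ltd = dplyr::lag(ltd, n = 3),
    l4_ltd = dplyr::lag(ltd, n = 4),
    f1_ltd = dplyr::lead(ltd, n = 1),
    f2_ltd = dplyr::lead(ltd, n = 2),
    f3_ltd = dplyr::lead(ltd, n = 3),
    f4_ltd = dplyr::lead(ltd, n = 4)
  ) %>%
  ungroup()
```

With the grouping in place, the first year of every country has NA lags and the last year has NA leads, which matches the pattern in the data.table output.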

`setDT()` converts your data frame into a data.table by reference. Check `class(crisisdata)` before and after running `setDT(crisisdata)`. — langtang, Sep 07 '22 at 20:57
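
The check suggested in the comment can be sketched as follows. The small data frame `df` and the variables `class_before` and `class_after` are made up for illustration and are not from the thread.

```r
library(data.table)

# df is a hypothetical toy data frame, used only for this check
df <- data.frame(year = 2000:2002, country = "A", ltd = c(0.1, 0.2, 0.3))

class_before <- class(df)   # "data.frame"

# setDT() converts df in place; no assignment is needed
setDT(df)

class_after <- class(df)    # "data.table" "data.frame"

print(class_before)
print(class_after)
```

Because the object still inherits from data.frame after the conversion, code that expects a data frame generally keeps working.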
