Edit: fake data for a reproducible example:
df = matrix(runif(50*507), nrow = 50, ncol = 507)
df = data.frame(df)
df[,1] = seq(as.Date("2017/1/1"), as.Date("2017/2/19"), "days")
names(df) = paste0("var", 1:507)
names(df)[505:507] = c("mktrf", "smb", "hml")
names(df)[1] = "Date"
# Factor columns (mktrf, smb, hml), used as the independent variables
x = df[,505:507]
# All remaining series, used as the dependent variables
y <- df[,2:504]
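I have a function called shift that I'd like to apply to every column of a data frame. It shifts a numeric vector by a given number of positions: a negative shift_by lags the series (NA is padded at the start), and a positive shift_by leads it (NA is padded at the end). The function is as follows: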
shift <- function(x, shift_by) {
  stopifnot(is.numeric(shift_by))
  stopifnot(is.numeric(x))

  # Several shifts requested: return one column per shift
  if (length(shift_by) > 1)
    return(sapply(shift_by, shift, x = x))

  out <- NULL
  abs_shift_by <- abs(shift_by)
  if (shift_by > 0)
    out <- c(tail(x, -abs_shift_by), rep(NA, abs_shift_by))   # lead
  else if (shift_by < 0)
    out <- c(rep(NA, abs_shift_by), head(x, -abs_shift_by))   # lag
  else
    out <- x
  out
}
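To make the intended behaviour concrete, here is a minimal sketch on a small made-up vector. It uses the same head/tail idiom as shift with a shift of one position; the vector v is hypothetical and not part of my data.

```r
# Hypothetical five-element series, for illustration only
v <- c(10, 20, 30, 40, 50)

# Equivalent to shift(v, -1): lag by one, pad the start with NA
lagged <- c(rep(NA, 1), head(v, -1))

# Equivalent to shift(v, 1): lead by one, pad the end with NA
led <- c(tail(v, -1), rep(NA, 1))

print(lagged)  # NA 10 20 30 40
print(led)     # 20 30 40 50 NA
```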
When I call sapply like this, where y is a data frame of the time series variables I want to lag:
y_lag <- sapply(y,shift,-1 )
I get the following error:
Error: cannot allocate vector of size 54.2 Mb
In addition: Warning messages:
1: In unlist(x, recursive = FALSE) :
Reached total allocation of 8072Mb: see help(memory.size)
2: In unlist(x, recursive = FALSE) :
Reached total allocation of 8072Mb: see help(memory.size)
3: In unlist(x, recursive = FALSE) :
Reached total allocation of 8072Mb: see help(memory.size)
4: In unlist(x, recursive = FALSE) :
Reached total allocation of 8072Mb: see help(memory.size)
5: In unlist(x, recursive = FALSE) :
Reached total allocation of 8072Mb: see help(memory.size)
6: In unlist(x, recursive = FALSE) :
Reached total allocation of 8072Mb: see help(memory.size)
My question: is there a different way to lag every column of a data frame so that I can still pass the result to the lm function? Or, failing that, how do I address the memory issue? I can't use a different computer.
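One possible direction, sketched under assumptions rather than as a verified fix: lapply returns a list, and assigning that list into y_lag[] keeps the data frame structure, so sapply's simplification step (which is likely where the unlist call in the warnings comes from) is skipped. The names y_small and lag1 are hypothetical stand-ins, and whether this avoids the allocation error on the real data is untested.

```r
set.seed(42)

# Hypothetical stand-in for y: 20 rows, 3 numeric columns
y_small <- data.frame(a = runif(20), b = runif(20), d = runif(20))

# Lag a vector by one period, padding the start with NA
lag1 <- function(v) c(NA, head(v, -1))

# lapply returns a list; assigning into y_lag[] keeps the data frame shape
y_lag <- y_small
y_lag[] <- lapply(y_small, lag1)

# Regress a on lagged b; with the default na.action (na.omit),
# lm drops the first row because its lagged value is NA
fit <- lm(a ~ b, data = data.frame(a = y_small$a, b = y_lag$b))
```

The result stays a data frame with the original column names, so individual lagged columns can be referenced by name in an lm formula.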