I'm trying to replace NA and zero values recursively. I'm working on time series data where an NA or zero is best replaced with the value from the previous week (one measurement every 15 minutes, so 672 steps back). My data contains about two years of 15-minute values, so it is a large set. Few NAs or zeros are expected, and runs of adjacent zeros or NAs longer than 672 values are not expected either.
I found this thread (recursive replacement in R), which shows a recursive approach, and adapted it to my problem.
load[is.na(load)] <- 0
o <- rle(load)
o$values[o$values == 0] <- o$values[which(o$values == 0) - 672]
newload<-inverse.rle(o)
Is this the best, or at least an elegant, method? And how can I protect my code from errors when a zero value occurs within the first 672 values?
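One thing worth checking in the rle() version: o$values holds one entry per run, not one per measurement, so an offset of 672 applied to an index into o$values steps back 672 runs rather than 672 time steps. A minimal sketch on a toy vector (the vector x is made up for illustration):

```r
# Toy vector, invented for illustration only
x <- c(5, 5, 0, 0, 0, 7)
o <- rle(x)

o$values          # 5 0 7   (one entry per run)
o$lengths         # 2 3 1
length(o$values)  # 3, whereas length(x) is 6
```

Because of this, indexing by element position (for example with which()) may be a safer basis for a fixed lag than indexing into the run-length encoding.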
I'm used to MATLAB, where I would do something like:
% Replace NaN with 0
Load(isnan(Load)) = 0;
% Find zero values (as a row vector, so the loop visits one index at a time)
Ind = reshape(find(Load == 0), 1, []);
for f = Ind
    if f > 672
        fprintf('Replacing index %d with the load 1 week ago\n', f)
        % Replace zero with previous week value
        Load(f) = Load(f - 672);
    end
end
As I'm not familiar with R, how would I set up such an if/else loop?
A reproducible example (I changed the code, because the example from the other thread didn't cope with adjacent zeros):
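For comparison, a direct R translation of the MATLAB loop could look roughly like the sketch below. The function name fill_from_lag and its lag argument are hypothetical, and the example uses a lag of 24 on small toy data instead of 672. Indices without an earlier value are simply left at 0 here, which is an assumption about the desired behaviour.

```r
# Hypothetical helper: replace NA/zero values with the value `lag` steps earlier.
fill_from_lag <- function(x, lag) {
  x[is.na(x)] <- 0              # treat NA like zero, as in the MATLAB code
  for (i in which(x == 0)) {    # zero positions, visited in increasing order
    if (i > lag) {
      x[i] <- x[i - lag]        # the source may itself be an already-filled value
    }
    # positions <= lag have no earlier value and stay 0 (assumption)
  }
  x
}

# Toy data: ten repetitions of a 24-step pattern with some gaps
day <- 1:24
load <- rep(day, times = 10)
load[50:54] <- 0
load[112:115] <- NA

newload <- fill_from_lag(load, lag = 24)
```

Since the loop walks forward through the vector, a gap longer than the lag would be filled from values that were themselves filled earlier in the same pass.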
day<-1:24
load<-rep(day, times=10)
load[50:54]<-0
load[112:115]<-NA
load[is.na(load)] <- 0
load[load==0]<-load[which(load == 0) - 24]
This gives back the original load vector without zeros and NAs. When a zero exists within the first 24 values, it goes wrong, because there is no earlier value to replace it with:
loadtest[c(10,50:54)]<-0 # instead of load[50:54]<-0 gives:
Error in loadtest[which(loadtest == 0) - 24] :
only 0's may be mixed with negative subscripts
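The error can be traced by looking at the source indices on their own. A small sketch (the name src is introduced here only for illustration):

```r
loadtest <- rep(1:24, times = 10)
loadtest[c(10, 50:54)] <- 0

# Positions the replacement values would be read from
src <- which(loadtest == 0) - 24
src  # -14 26 27 28 29 30
```

The zero at position 10 maps to index -14, and R does not allow negative and positive subscripts in the same index vector.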
To work around this, an if/else statement could be used, but I don't know how to apply it. Something like:
day<-1:24
loadtest<-rep(day, times=10)
loadtest[c(10,50:54)]<-0
loadtest[112:115]<-NA
loadtest[is.na(loadtest)] <- 0
if(INDEX(loadtest[loadtest==0])<24) {
# nothing / mean / standard value
} else {
loadtest[loadtest==0]<-loadtest[which(loadtest == 0) - 24]
}
Of course, INDEX isn't valid code.
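One possible way to express the intended check without a loop is to filter the zero positions before subtracting the lag. This is only a sketch under the assumptions stated in the question: the names lag, idx and ok are introduced here for illustration, and early zeros are left untouched as a placeholder for "nothing / mean / standard value".

```r
day <- 1:24
loadtest <- rep(day, times = 10)
loadtest[c(10, 50:54)] <- 0
loadtest[112:115] <- NA
loadtest[is.na(loadtest)] <- 0

lag <- 24
idx <- which(loadtest == 0)   # positions of all zeros
ok  <- idx[idx > lag]         # keep only those with a value `lag` steps back

loadtest[ok] <- loadtest[ok - lag]
# position 10 has no earlier value and is still 0 at this point
```

This is a single vectorized pass, so it relies on the value one lag back not being zero itself. If gaps longer than the lag could occur, the replacement would need to be repeated or done sequentially.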