You can use seq.POSIXt to create a data.frame with no missing time steps (the object grid. below), and then use merge to combine it with the observed data (df in my example). This should solve your problem:
# Create a sample data.frame missing every second observation.
df <- data.frame(date=seq.POSIXt(from=as.POSIXct("1970-01-01 00:00:00"), to=as.POSIXct("1970-01-01 10:00:00"), by="2 hours"), rainfall=rnorm(6))
# Create a sequence of times with nothing missing.
grid. <- data.frame(date=seq.POSIXt(as.POSIXct("1970-01-01 00:00:00"), to=as.POSIXct("1970-01-01 10:00:00"), by="1 hours"))
# Merge them together keeping all the values from grid.
dat. <- merge(grid., df, by="date", all.x=TRUE)
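To check that the gaps were actually filled, you can look for the rows where the merged value is NA. This is only a sketch with made-up rainfall values. The names obs, full, filled and missing_times are hypothetical, and tz = "UTC" is an assumption I added so that daylight-saving changes do not shift the hourly grid.

```r
# Hypothetical observed data: one reading every two hours (values are made up).
obs <- data.frame(
  date = seq(as.POSIXct("1970-01-01 00:00:00", tz = "UTC"),
             as.POSIXct("1970-01-01 10:00:00", tz = "UTC"),
             by = "2 hours"),
  rainfall = c(0.1, 0.0, 2.3, 1.1, 0.0, 0.4)
)

# Complete hourly grid covering the same period as the observations.
full <- data.frame(date = seq(min(obs$date), max(obs$date), by = "1 hour"))

# Keep every row of the grid; time steps with no observation get NA.
filled <- merge(full, obs, by = "date", all.x = TRUE)

# Time steps that were missing from the observed data.
missing_times <- filled$date[is.na(filled$rainfall)]
```

Building the grid from min(obs$date) and max(obs$date) avoids typing the start and end times twice, but it assumes the first and last observations are themselves present.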
One way to remove duplicated values is to find them and drop them using the duplicated function.
# The ! negates the logical vector (TRUE becomes FALSE), so dup_index is TRUE for the rows to keep.
dup_index <- !duplicated(dat.[,1])
# Now re-create the dat. object with only non-duplicated rows.
dat. <- dat.[dup_index,]
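Note that duplicated keeps the first occurrence of each time stamp and flags the later ones. If you would rather keep the last one, the fromLast argument does that. A small sketch on made-up data (the names times, x, first_kept and last_kept are hypothetical, and tz = "UTC" is my assumption):

```r
# Hypothetical data with one repeated time stamp (values are made up).
times <- as.POSIXct(c("1970-01-01 00:00:00",
                      "1970-01-01 01:00:00",
                      "1970-01-01 01:00:00",
                      "1970-01-01 02:00:00"), tz = "UTC")
x <- data.frame(date = times, rainfall = c(0.5, 1.0, 3.0, 0.2))

# duplicated() flags the second and later occurrences, so the first is kept.
first_kept <- x[!duplicated(x$date), ]

# fromLast = TRUE flags the earlier occurrences instead, so the last is kept.
last_kept <- x[!duplicated(x$date, fromLast = TRUE), ]
```

Which one you want depends on whether the later record is a correction of the earlier one, which only you can tell from your data.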
The other way to do it is to use the aggregate function. This is useful if the duplicates are really two different observations and you therefore want the mean of the two:
# Note: the result has columns named Group.1 (the date) and x (the mean).
dat. <- aggregate(dat.[,2], by=list(dat.[,1]), FUN=mean)
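If you want to keep the original column names, you can pass a one-column data.frame and a named list to aggregate. This is a sketch on made-up data. The names times, x and averaged are hypothetical, and tz = "UTC" is my assumption. I use the by= form here rather than the formula form, because the formula form drops rows with NA by default, which would throw away the gaps you just filled in.

```r
# Hypothetical data: a repeated time stamp plus one filled-in gap (NA).
times <- as.POSIXct(c("1970-01-01 00:00:00",
                      "1970-01-01 01:00:00",
                      "1970-01-01 01:00:00",
                      "1970-01-01 02:00:00"), tz = "UTC")
x <- data.frame(date = times, rainfall = c(0.5, 1.0, 3.0, NA))

# Average rainfall per time stamp; the named list keeps "date" as the column
# name and x["rainfall"] keeps "rainfall" as the value column name.
averaged <- aggregate(x["rainfall"], by = list(date = x$date), FUN = mean)
```

The two readings at 01:00 are replaced by their mean, and the time step with NA stays in the result as NA.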
HTH