2
library(data.table)

I am trying to do this.

wd <- structure(list(Year = c(2006L, 2006L, 2006L), day = c(361L, 361L, 
360L), hour = c(14L, 8L, 8L), mint = c(30L, 0L, 30L), valu1 = c(0.5, 
0.3, 0.4), Date = structure(c(1167229800, 1167206400, 1167121800
), class = c("POSIXct", "POSIXt"), tzone = "UTC")), .Names = c("Year", 
"day", "hour", "mint", "valu1", "Date"), row.names = c(NA, -3L
), class = "data.frame")

wg <- c("2006/12/27 14:23:59", "2006/12/27 16:47:59", "2006/12/27 19:12:00")
w <- c("0.4", "0.2", "0.5")
wf=data.frame(wg,w)
wg <- as.POSIXct(wf$wg, format = "%Y/%m/%d %T", tz = "UTC") 
WG <- data.table(start = wg, end = wg)
setkey(WG)
## Do the same for `wd` adding +/- 30 minutes 
setDT(wd)[, `:=`(start = Date - 1800L, end = Date + 1800L)]
## Run foverlaps and extract the match `valu1` column
foverlaps(wd, WG, nomatch = 0L)[, .(wdDate = Date, valu1, WGDate = start)]
                    wdDate valu1              WGDate
   1: 2006-12-27 14:30:00   0.5 2006-12-27 14:23:59

As you can see in the final results only valu1 was extracted from wd but I would like also to extract the corresponding values from w in wf.

So I want something like this:

                     wdDate valu1              WGDate      w
    1: 2006-12-27 14:30:00   0.5 2006-12-27 14:23:59      0.4    

Any idea is welcome.

Real data:

  head(wf)
       date1  date2        date3n               wg         w    whyt
1     <NA> 2003-01-01      <NA>                <NA>        NA   NA
2     <NA> 2003-01-02      <NA>                <NA>        NA   NA
3     <NA> 2003-01-03      <NA> 2003/01/03 10:30:00 0.2137352 0.34
4     <NA> 2003-01-04      <NA>                <NA>        NA   NA

Facing a problem here:

Uwe Keim
  • 39,551
  • 56
  • 175
  • 291
sacvf
  • 2,463
  • 5
  • 36
  • 54

1 Answers1

5

In my previous answer I've created WG because you provided wg as a single vector. If you already have a data set called wf, this whole proccess is not needed. You just need to adjust wf correctly and then run foverlaps. In other words, forget about WG and do the following

setDT(wf)[, wg := as.POSIXct(wg, format = "%Y/%m/%d %T", tz = "UTC")]
wf[, `:=`(start = wg, end = wg)]
setkey(wf, start, end)
setDT(wd)[, `:=`(start = Date - 1800L, end = Date + 1800L)]

foverlaps(wd, wf, nomatch = 0L)[, .(wdDate = Date, valu1, WGDate = start, w)]
#                 wdDate valu1              WGDate   w
# 1: 2006-12-27 14:30:00   0.5 2006-12-27 14:23:59 0.4
Community
  • 1
  • 1
David Arenburg
  • 91,361
  • 17
  • 137
  • 196
  • I've added ````setDT(wd)[, `:=`(start = Date - 1800L, end = Date + 1800L)]```` back, I thought you already did that line. Everything up until (including) `wf=data.frame(wg,w)` is the same as in your post so I didn't repost it again. – David Arenburg May 11 '15 at 12:36
  • Which column has NAs? `wg`? – David Arenburg May 11 '15 at 12:46
  • 1
    `w` doesn't matter because it is not being joined on. Try removing `NA`s from `wf` using `wf <- na.omit(wf, by = "wg")` and then run `foverlaps` – David Arenburg May 11 '15 at 12:53
  • The code doesn't search for *exactly* +-30, it takes everything within that range exactly like you need. There is a `mult = "all"` argument that you can add to `foverlaps` but this is should be already the default. Can you please provide a reproducible example and show what would you expect to get? I'm starting to wonder though if you trying to match `wf` to `wd` ranges or the other way around. – David Arenburg May 11 '15 at 13:06
  • Sorry, do `wf <- na.omit(wf, cols = "wg")` (Use `cols` instead of `by`. I wrote the code without testing in my previous comment). – David Arenburg May 11 '15 at 13:13
  • 1
    If you have *real* `NA`s, `wf <- na.omit(wf, cols = "wg")` will work. For your second question, you can calculate means after words. It is a very simple aggregation task using `data.table`. – David Arenburg May 11 '15 at 13:43
  • Per your edit, you want remove one row, or to assign the average to both rows? – David Arenburg May 11 '15 at 13:47
  • 1
    Try `unique(res[, valu1 := mean(valu1), by = WGDate], by = "WGDate")` – David Arenburg May 11 '15 at 13:52