-2

I have a dataframe x which has two columns. Let's say they are A and B columns. A column are dates, B column are values that taken on that date. Both A and B column are continuous, no missing dates, and range from Jan to Oct in days. Right now I have another dataframe y which has column A and column C. A are also dates, C are values taken by another instrument on that specific date. However, A and C in this dataframe is not continuous (there are missing dates). For example, I have value on 1/1 but next value will be on 1/10. Now I would like to join values of C in dataframe y (discrete dates) to the first dataframe x which has continuos dates, according to the common A column, is there a efficient way to do this?

Xin Qiao
  • 11
  • 2
  • possible duplicate of [How to join data frames in R (inner, outer, left, right)?](http://stackoverflow.com/questions/1299871/how-to-join-data-frames-in-r-inner-outer-left-right) – Henrik Oct 07 '14 at 07:32
  • Sorry about the duplication. – Xin Qiao Oct 07 '14 at 19:17

1 Answers1

4

You could use merge from base R

 res1 <- merge(x,y, by="A", all.x=TRUE)
 dim(res1)
 #[1] 10  3

 head(res1,3)
 #          A          B         C
 #1 2011-04-03  1.3709584 0.2101654
 #2 2011-04-04 -0.5646982        NA
 #3 2011-04-05  0.3631284        NA

or left_join from dplyr (should be fast on big datasets)

 library(dplyr)
 res2 <- left_join(x,y, by="A") %>% #in case `date` column gets coerced to `numeric`
                        mutate(A= as.Date(A, origin='1970-01-01'))


 all.equal(res1, as.data.frame(res2))
 #[1] TRUE

Data

  set.seed(42)
  x <- data.frame(A=seq(as.Date("2011-04-03"), length.out=10, by=1), B=rnorm(10))
  set.seed(384)
  y <- data.frame(A=seq(as.Date("2011-04-03"), length.out=6, by=3), C=rnorm(6))
akrun
  • 874,273
  • 37
  • 540
  • 662