I have two data frames:

df1
Date    Duration
6/27/2014   10.00
6/30/2014   20.00
7/11/2014   15.00

and

df2
Date    Percent_Removal
6/27/2014   20.39
6/30/2014   27.01
7/7/2014    49.84
7/11/2014   59.48
7/17/2014   99.04

I want to merge these two data frames on the 'Date' column, keeping only the dates that appear in df1. The output should look like this:

df3
Date    Duration_sum    Percent_Removal
6/27/2014   10.00        20.39
6/30/2014   20.00        27.01
7/11/2014   15.00        59.48

I tried the following code:

df1$Date <- as.Date(df1$Date, format = "%m/%d/%Y")
df2$Date <- as.Date(df2$Date, format = "%m/%d/%Y")
df3 <- as.data.frame(merge(df1, df2, by.x = "Date", all.x = TRUE))

My output is:

df3

 Date      Duration_sum   Percent_Removal
6/27/2014     10.00           NA
6/30/2014     20.00           NA
7/11/2014     15.00           NA

I will be highly grateful if someone can help me out with this problem. Thanks in advance.

Sami
  • `merge.data.frame(df1,df2)` – dww Dec 29 '16 at 01:31
  • `df3 = merge(df1, df2, by="Date", all.x=TRUE)`. This will discard any rows in `df2` that don't match a `Date` in `df1`. If you wanted to keep all rows from both data frames, regardless of whether they have a match in the other data frame, you would use `all=TRUE` instead of `all.x=TRUE`. – eipi10 Dec 29 '16 at 01:31
  • (1) What are the results of `merge(df1, df2, by = "Date", all = TRUE)`? (2) Same questions with dates stored as character values. – Fr. Dec 29 '16 at 01:36
  • `merge(df1, df2, by = "Date", all = TRUE)` still returns no values for `df3$Percent_Removal`, the same as my result for df3 in the question. – Sami Dec 29 '16 at 01:43

2 Answers

You could be super lazy and avoid making a third df altogether:

 df1$Percent_Removal <- df2$Percent_Removal[df2$Date == df1$Date]

This only works if each date appears once in each data frame and both data frames list the same dates in the same order, because `==` compares the two vectors element by element. A more robust approach might involve the plyr package.
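
When the data frames differ in length or order, a lookup by value is needed instead of a positional comparison. As a minimal sketch (not part of the original suggestion), base R's `match()` can do that lookup. The data below is the sample from the question, and `stringsAsFactors = FALSE` is an assumption added so the dates stay character in older R versions:

```r
# Sample data from the question, read as plain character/numeric columns
df1 <- read.table(text = "Date Duration
6/27/2014 10.00
6/30/2014 20.00
7/11/2014 15.00", header = TRUE, stringsAsFactors = FALSE)

df2 <- read.table(text = "Date Percent_Removal
6/27/2014 20.39
6/30/2014 27.01
7/7/2014 49.84
7/11/2014 59.48
7/17/2014 99.04", header = TRUE, stringsAsFactors = FALSE)

# For each date in df1, find the position of the first equal date in df2
# (NA where df2 has no such date)
idx <- match(df1$Date, df2$Date)

# Pull the matching values into df1 without creating a third data frame
df1$Percent_Removal <- df2$Percent_Removal[idx]
df1
```

Because `match()` returns only the first position it finds, this still assumes each date occurs at most once in df2.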

SeldomSeenSlim
  • Getting error message: "longer object length is not a multiple of shorter object length" – Sami Dec 29 '16 at 01:51

This is too long for a comment, but I just need to demonstrate that the solution I gave in the comments does work. If you are having problems getting merge to work, there must be some other issue with your data, which we cannot diagnose because you did not provide a dput of your data frames.

# Recreate the two data frames from the question
df1 = read.table(text = 
"Date    Duration
6/27/2014   10.00
6/30/2014   20.00
7/11/2014   15.00",
header = TRUE)

df2 = read.table(text = 
"Date    Percent_Removal
6/27/2014   20.39
6/30/2014   27.01
7/7/2014    49.84
7/11/2014   59.48
7/17/2014   99.04",
header = TRUE)

# Parse the month/day/year strings into Date objects
df1$Date <- as.Date(df1$Date, format = "%m/%d/%Y")
df2$Date <- as.Date(df2$Date, format = "%m/%d/%Y")

# Merge on the shared Date column, keeping only dates present in both
df3 = merge(df1, df2)
df3
#         Date Duration Percent_Removal
# 1 2014-06-27       10           20.39
# 2 2014-06-30       20           27.01
# 3 2014-07-11       15           59.48

Note that no additional options need to be specified in the merge statement because

  1. The default value of `by` is the set of column names common to both data frames. In this case, only Date is shared.
  2. The default values of `all.x`, `all.y` and `all` (all `FALSE`) give the desired behaviour, where only the rows that are in both data frames are kept.
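
The effect of these defaults can be sketched by comparing the default call with `all.x = TRUE` and `all = TRUE` on the sample data from the question. The names `inner`, `left` and `full` are illustrative only, and `stringsAsFactors = FALSE` is an assumption for older R versions:

```r
df1 <- read.table(text = "Date Duration
6/27/2014 10.00
6/30/2014 20.00
7/11/2014 15.00", header = TRUE, stringsAsFactors = FALSE)

df2 <- read.table(text = "Date Percent_Removal
6/27/2014 20.39
6/30/2014 27.01
7/7/2014 49.84
7/11/2014 59.48
7/17/2014 99.04", header = TRUE, stringsAsFactors = FALSE)

df1$Date <- as.Date(df1$Date, format = "%m/%d/%Y")
df2$Date <- as.Date(df2$Date, format = "%m/%d/%Y")

# Defaults: join on the shared column(s), keep only rows present in both
inner <- merge(df1, df2)

# all.x = TRUE: keep every row of df1, filling NA where df2 has no match
left <- merge(df1, df2, all.x = TRUE)

# all = TRUE: keep every row of both data frames
full <- merge(df1, df2, all = TRUE)

nrow(inner)  # 3
nrow(left)   # 3, because every df1 date also occurs in df2
nrow(full)   # 5, with Duration NA for the two dates missing from df1
```

For this data the default and `all.x = TRUE` give the same rows, so an all-NA column after `all.x = TRUE` points to keys that do not actually match.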
dww
  • I am so sorry, actually there was a small problem with my date values. I fixed it and now merge function works fine. Thank you so much for your help. – Sami Dec 29 '16 at 02:44
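
The thread does not say what the problem with the date values was. As a hedged sketch of how such a problem might be tracked down, the following uses hypothetical data frames `a` and `b` (not the asker's data) in which one date string is written in a different format, then checks for dates that failed to parse and for keys with no partner:

```r
# Hypothetical inputs: the last date in `a` uses a different format
a <- data.frame(Date = c("6/27/2014", "6/30/2014", "2014-07-11"),
                Duration = c(10, 20, 15), stringsAsFactors = FALSE)
b <- data.frame(Date = c("6/27/2014", "6/30/2014", "7/11/2014"),
                Percent_Removal = c(20.39, 27.01, 59.48),
                stringsAsFactors = FALSE)

# Parse into a separate column so the raw strings stay available for inspection
a$Parsed <- as.Date(a$Date, format = "%m/%d/%Y")
b$Parsed <- as.Date(b$Date, format = "%m/%d/%Y")

# Rows whose date string did not parse with the given format
bad_rows <- a[is.na(a$Parsed), ]

# Raw date strings in `a` that have no matching parsed date in `b`
unmatched <- a$Date[!(a$Parsed %in% b$Parsed)]

bad_rows
unmatched
```

Sharing `dput(df1)` and `dput(df2)` in the question, as suggested in the answer above, would let others run checks like these on the real data.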