-1

How can I merge the content two data.frames / tables without adding additional columns?

Example

Thanks!

DragonUS
  • 41
  • 4
  • Please don't post your data as an image. Instead type `dput(table1)` and `dput(table2)` in your R console and copy and paste their outputs into your question by editing it. You might want to also review this post for asking a reproducible question: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – acylam Sep 29 '17 at 20:45
  • @useR Thanks for the advice! I appreciate it. – DragonUS Sep 29 '17 at 20:57

2 Answers2

1

This will work for you if you want to merge by column "ID" in your two datasets.

merge(x = TABLE1, y = TABLE2[ , c("Name", "Date", "ID")], by = "ID", all=TRUE)

I am sub-setting the second dataset while merging. If you want to merge by Name, Date and ID columns then you can do as below:

merge(x = TABLE1, y = TABLE2[ , c("Name", "Date", "ID")], by = c("Name", "Date", "ID"), all=TRUE)

#Output
  Name       Date      ID   Price
1 Green       <NA>  KF3902    NA
2  Pink       <NA>  F43346    NA
3  <NA> 2017-08-04     AMA     2
4  <NA> 2017-09-29    <NA>    33
Santosh M.
  • 2,356
  • 1
  • 17
  • 29
  • to get OP's desired output dataset, you actually want `all = TRUE` – acylam Sep 29 '17 at 20:43
  • I receive the following error message: "Error in merge.data.table(x = TABLE1, y = TABLE2[, "Name"], : Elements listed in `by` must be valid column names in x and y" - the names are exactly the same. – DragonUS Sep 29 '17 at 20:46
  • @useR corrected. Thanks for pointing out. – Santosh M. Sep 29 '17 at 20:48
  • I corrected to "all = TRUE". Still the same error. – DragonUS Sep 29 '17 at 20:49
  • @Tim It's better that you post our data in your question so we can reproduce your error. It's pretty hard to tell what's wrong when we don't have the same dataset you are using. – acylam Sep 29 '17 at 20:51
  • @Tim Just copy my `merge` line. You will get your desired result. It is returning error because column mentioned in `by` does not exist in TABLE2. You only extracted one column "Name" when you are editing the code `TABLE2[, "Name"]` – Santosh M. Sep 29 '17 at 20:53
  • @Tim As @useR suggested put the output you get of `dput(head(TABLE1))` and `dput(head(TABLE2))` in your question if you still getting errors and not getting desired output. – Santosh M. Sep 29 '17 at 20:58
1

Here is a solution with dplyr:

library(dplyr)
full_join(table1, table2, 
          by = c("Name", "Date", "ID")) %>%
  select(-Cost, -PNL) %>%
  rename(Item = ID)

Result:

   Name       Date Price   Item
1  <NA> 2017-09-29    33   <NA>
2  <NA> 2017-08-04     2    AMA
3  Pink       <NA>    NA F43346
4 Green       <NA>    NA KF3902

Notes:

My solution made some assumptions about the datasets table1 andtable2. For example, whether Date has already been converted to a date variable, and whether the Name and ID variables is character type instead of factor. I will therefore adjust my solution once the OP provides data.

Data:

table1 = data.frame(Name = as.character(c(NA, NA)),
                    Date = as.Date(c("2017-9-29", "2017-8-4")),
                    Price = c(33, 2),
                    ID = c(NA, "AMA"),
                    stringsAsFactors = FALSE)

table2 = data.frame(Name = c("Pink", "Green"),
                    Cost = c(25, 876),
                    PNL = c(11, 252),
                    Date = as.Date(c(NA, NA)),
                    ID = c("F43346", "KF3902"),
                    stringsAsFactors = FALSE)
acylam
  • 18,231
  • 5
  • 36
  • 45