-4

I have two dataframes of different dimensions, call them df1 and df2. They both have columns of integers called Product and Output.

df2 has the unique column of numbers called ActualOne. I want to append all the values of ActualOne from df2 to df1, where the Product and Output values of df1 are jointly matched with those in df2. For Product,Output pairs not found in df2, ActualOne is populated by NAs.

Attempt #1: Use data.table

library(data.table)
setDT(df1)
setDT(df2)

df1[df2, ActualOne := i.ActualOne, on=c("Product", "Output")]

I am by no means an expert at data.table. I didn't think anything had to be reformatted and thought the temporary storing as a data.table was better.

I received the following error:

Error in [.data.table(df1, df2, :=(ActualOne, i.ActualOne), : unused argument (on = c("Product", "Output"))

Attempt #2: Use a matching function

Off the top of my head something along the lines of:

df1$ActualOne <- df2[match(df1$Product,df2$Product) & match(df1$Output,df2$Output,"ActualOne"]

This results in the ActualOne column of df1 being populated by "ActualOne"

Thanks for your help.

David Arenburg
  • 91,361
  • 17
  • 137
  • 196
  • 4
    Which version of `data.table` do you have? The `on` feature works from *v1.9.6* or higher. – Jaap Oct 19 '15 at 18:49
  • 2
    Moreover: always try to give a [minimal reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). – Jaap Oct 19 '15 at 18:50
  • http://www.inside-r.org/packages/cran/data.table/docs/merge – Chris Oct 19 '15 at 18:51

3 Answers3

1

You have an older version of data.table. If you want your code to work you need to update to v1.9.6+ - Please check out the updates here.

For older versions, the idiomatic binary join + update would be to first key both data.tables and then join/update. Something like (we don't have actual data to test):

setkey(df1, Product, Output)
setkey(df2, Product, Output)
df1[df2, ActualOne := i.ActualOne]
David Arenburg
  • 91,361
  • 17
  • 137
  • 196
0

Here is one way to do it:

actualTF <- df1$Product == df2$Product & df1$Output == df2$Output
df1$ActualOne <- sapply(1:50, function(x) if (actualTF[x]) return(df2$ActualOne[x]) else NA)
smanchan
  • 1
  • 1
0

You want the merge() function I think. Here's an example per your description:

library(data.table)
df1 <- data.table(Product=1:10, Output=101:110)
df2 <- data.table(Product=1:3, Output=101:103, ActualOne=c('A', 'B', 'C'))
merge(df1, df2, by=c('Product', 'Output'), all.x=TRUE) 
DunderChief
  • 726
  • 4
  • 7