
I am getting an error when I try to merge two tables with an inner join using the code below:

table_Left<-matrix(c(1:6,rep("Toaster",3),rep("Radio",3)),ncol = 2)
colnames(table_Left)<-c("Customer_ID","Product")
table_Left<-as.table(table_Left)
table_Left
table_Right<-matrix(c(2,4,6,rep("Alabama",2),"Ohio"),ncol = 2)
colnames(table_Right)<-c("Customer_ID","State")
table_Right<-as.table(table_Right)
table_Right
merge(x=table_Left, y=table_Right, by="Customer_ID")

Error: Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column

Please advise how to correct this.

Naveen
  • It will work if you replace `as.table` by `as.data.frame`. – markus Oct 28 '18 at 18:46
  • Ah! I was just about to post an answer with explanations when the question got closed – prosoitos Oct 28 '18 at 19:01
  • I don't think that it is a duplicate, as the OP was trying to merge tables (not data frames). So I was wondering if you would consider re-opening the question @markus. Thank you! – prosoitos Oct 28 '18 at 19:03
  • Since I can't post my answer, I will just add here that you don't have to convert your matrices to data frames: `merge()` will work on data frames, but also on matrices since they can be coerced to data frames (have a look at `?merge`). So for your code to work, you only have to remove the lines in which you are coercing your matrices into tables. – prosoitos Oct 28 '18 at 19:11
  • That said, creating data frames (with `data.frame()`) rather than matrices when you are creating your data in the first place might be more suitable (unless you have a good reason to create matrices). – prosoitos Oct 28 '18 at 19:14
  • What led to your confusion is the fact that we commonly refer to data frames as "tables". But `as.table()` will not create one such "table" (i.e. data frame). Rather, it will coerce your object to a contingency table, which is a very different thing. – prosoitos Oct 28 '18 at 19:19

1 Answer


I think that your problem is due to a confusion around the term "table". Data frames are a very common class of objects when using R for data science. And, in common language, they are often referred to as "tables". The function as.table() that you used does not however have anything to do with data frames: as.table() creates contingency tables (which is not at all what you want here).
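To see why the error mentions the 'by' column, it can help to look at what merge() actually receives. The sketch below is only an illustration: it reuses the matrix from the question (under the name ma_Left) and assumes that merge() falls back to as.data.frame() for objects that are not data frames, which is what ?merge describes. The names cols_from_table and cols_from_matrix are made up for this example.

```r
# Same matrix as in the question
ma_Left <- matrix(c(1:6, rep("Toaster", 3), rep("Radio", 3)), ncol = 2)
colnames(ma_Left) <- c("Customer_ID", "Product")

# as.table() turns the matrix into a contingency table
table_Left <- as.table(ma_Left)
class(table_Left)

# Coercing a contingency table to a data frame gives one row per cell,
# so the original column names do not survive as columns
cols_from_table <- names(as.data.frame(table_Left))
cols_from_table

# Coercing the plain matrix keeps the column names
cols_from_matrix <- names(as.data.frame(ma_Left))
cols_from_matrix
```

Since "Customer_ID" is not among the columns obtained from the contingency table, merge() has nothing to match by = "Customer_ID" against.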

The most efficient way to create the 2 data frames (or "tables") that you want is to create them directly with the function data.frame():

df_Left <- data.frame(
  Customer_ID = 1:6,
  Product = c(rep("Toaster", 3), rep("Radio", 3))
)

df_Left

      Customer_ID Product
    1           1 Toaster
    2           2 Toaster
    3           3 Toaster
    4           4   Radio
    5           5   Radio
    6           6   Radio

df_Right <- data.frame(
  Customer_ID = c(2, 4, 6),
  State = c(rep("Alabama", 2), "Ohio")
)

df_Right

      Customer_ID   State
    1           2 Alabama
    2           4 Alabama
    3           6    Ohio

And then your code with the merge() function will work:

merge(x = df_Left, y = df_Right, by = "Customer_ID")

  Customer_ID Product   State
1           2 Toaster Alabama
2           4   Radio Alabama
3           6   Radio    Ohio
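Called this way, merge() keeps only the rows whose Customer_ID appears in both data frames, which is the inner join asked for in the question. As a small sketch for comparison (the names inner_join and left_join are just illustrative variables, not functions from any package), the all.x argument keeps every row of x instead:

```r
df_Left <- data.frame(
  Customer_ID = 1:6,
  Product = c(rep("Toaster", 3), rep("Radio", 3))
)

df_Right <- data.frame(
  Customer_ID = c(2, 4, 6),
  State = c(rep("Alabama", 2), "Ohio")
)

# Default (all = FALSE): inner join, only matching Customer_IDs are kept
inner_join <- merge(x = df_Left, y = df_Right, by = "Customer_ID")

# all.x = TRUE: every row of df_Left is kept; State is NA where there is no match
left_join <- merge(x = df_Left, y = df_Right, by = "Customer_ID", all.x = TRUE)

nrow(inner_join)
nrow(left_join)
```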

Now, your code started with the creation of matrices. If you have a good reason, in your situation, to use matrices, merge() will also work.

If you look at the help file for the merge() function (with ?merge), you will see:

merge(x, y, ...)

x, y: data frames, or objects to be coerced to one.

And matrices can be coerced to data frames without creating any problem with your data. So you could also do:

ma_Left <- matrix(
  c(1:6, rep("Toaster", 3), rep("Radio", 3)), ncol = 2
)

colnames(ma_Left) <- c("Customer_ID", "Product")

ma_Right <- matrix(
  c(2, 4, 6, rep("Alabama", 2), "Ohio"), ncol = 2
)

colnames(ma_Right) <- c("Customer_ID", "State")

merge(x = ma_Left, y = ma_Right, by = "Customer_ID")

  Customer_ID Product   State
1           2 Toaster Alabama
2           4   Radio Alabama
3           6   Radio    Ohio
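One thing to keep in mind with the matrix route: a matrix holds a single type, so mixing numbers and strings makes every value a string, and Customer_ID is no longer numeric after the merge. A minimal sketch of converting it back, assuming the IDs are whole numbers (merged is a name chosen for this example):

```r
ma_Left <- matrix(
  c(1:6, rep("Toaster", 3), rep("Radio", 3)), ncol = 2
)
colnames(ma_Left) <- c("Customer_ID", "Product")

ma_Right <- matrix(
  c(2, 4, 6, rep("Alabama", 2), "Ohio"), ncol = 2
)
colnames(ma_Right) <- c("Customer_ID", "State")

merged <- merge(x = ma_Left, y = ma_Right, by = "Customer_ID")

# Character (or factor, in older versions of R), not numeric
id_was_numeric <- is.numeric(merged$Customer_ID)

# Going through as.character() first is safe whether the column is
# a character vector or a factor
merged$Customer_ID <- as.integer(as.character(merged$Customer_ID))
str(merged)
```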
prosoitos