0

I hope this isn't considered a duplicate question; I can't seem to figure it out. I have multiple tables that I'm trying to join together, by essentially doing a VLookup. That's how I would do it in Excel anyway.

I have 2 SQL data sets that are combined in this fashion.

Combined <- rbind(FGD, FNK)

In this Combined data set, I am trying to do a VLookup to the field named 'Category' and in another data set, named 'Category' I also have a field named 'Category'. This comes from an Excel file.

I tried this:

merge(Combined.Category, Category.Category, all=TRUE)

I just got this error.

Error in merge(Combined.Category, Category.Category, all = TRUE) : 
  object 'Combined.Category' not found

Isn't it like this? merge(Table1.Field1, Table2.Field2, all=TRUE)

Also, to make this a bit more complex, I want to do a VLookup to Category.Category and if no match is found, do another VLookup to Category.DES.

How can I do that? I'm pretty sure there is a way to do this, but I don't really know how to approach this kind of thing.

M--
  • 25,431
  • 8
  • 61
  • 93
ASH
  • 20,759
  • 19
  • 87
  • 200
  • 1
    Try: `merge(Combined, Category, by = "Category")` – pogibas Sep 26 '17 at 18:30
  • 2
    When asking for help you should include a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input data and the desired output. – MrFlick Sep 26 '17 at 18:30
  • 1
    `merge(Combined, Category, by = "Category", all= T)` ... and **`.`** does not work in R like it does in SQL. Moreover, if the variables have different names in two dataframes you can use `by.x = "Category", by.y = "Cat"` instead of `by`. – M-- Sep 26 '17 at 18:32
  • Thanks, that makes sense. Now, I'm getting this: AllCombined <- merge('Combined', 'Category', by.x = 'Category', by.y = 'Category') Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column Both are DataFrames: as.data.frame(Combined) & as.data.frame(Category). When I look at these data sets, it seems like by.x = 'Category' has no quotes and by.y = 'Category' has quotes around it. Maybe that's the issue...... – ASH Sep 26 '17 at 18:47

1 Answers1

3

You've got a few issues here.

1) You access the columns of a data.frame with $, not .

You seem to have two data.frames - Combined and Category. To access the Category column in each, use Combined$Category and Category$Category.

2) merge() expects data.frames, not columns. So as PoGibas pointed out, you'd want to do

merge(Combined, Category, by="?")

where ? is the name of the column that is common in the two data sets. Based on your description, it doesn't sound like this would be "Category" but there's not enough information for me to tell this. Let's say you have an ID column in both Combined and Category. Then you would do

merge(Combined, Category, by="ID")

3) Assuming you do have some id column or columns to link the two data sets, then the merged result has all columns from both data sets in it. Columns that appear in both data sets get a suffix added to them. The ones from the first data set will have "original_name.x" and the ones from the second will have "original_name.y"

Then you can make a new column that gets the value of Category$Category if it exists and the value of Category$DES if not:

mergeddata <- merge(Combined, Category, by="ID")
mergeddata$desired_value <- ifelse(is.na(mergeddata$Category.y),  
                                     mergeddata$Category.y, mergeddata$DES)

If you don't actually have NAs in your data, then you might need to change the condition in ifelse() to check for empty strings or some other value that indicates there isn't a valid Category.

cmaimone
  • 546
  • 4
  • 8
  • I think I see what the problem is here. Something like this should work fine: merge(Combined, Category, by = "Category", all= T). However...in the Category DF, the Category field is not unique. Concatenating the 'Category' field with the 'DES' field is unique. However...in the Combined DF, 'Category' and 'DES' are not concatenated. Outer, Left, and Right joins don't work and a Cross Join doesn't make any sense with this data set. Any additional thoughts on how to handle this? – ASH Sep 27 '17 at 18:34
  • Without actual data and a reproducible example, I'm not sure how to help you further – cmaimone Sep 27 '17 at 19:58