merge only one or two columns from a different dataframe in R

Question

I have two data frames. Below are the first few rows of each.

First Data frame Name: sample_sort
name                             id         supplier   usage
ABC                             10000079    811121     1
DEF                             10000182    541513     4
Supplier C                      10000484    531110     1
Supplier D                      10000523    541320     1
Supplier E                      10000592    524210     1
Supplier F                      10012711    237110     1

Second data frame Name: MBE
  id    State   total   CATEGORY
10000070    MD       5       MBE
10000182    PR       14      MBE
10000484    TX       1       MBE
10000526    MI       3       MBE
10000592    FL       1       MBE
10000680    ID       14      MBE

My actual dataset has many more columns. I want to combine the two data frames, but bring in only the CATEGORY column. The following merge statement works:

ncombined <- merge(x = sample_sort, y = MBE, by = "id", all.x = TRUE)

But this gives me all the columns from the MBE dataset. I tried the following, in several variations, so that only the CATEGORY column gets imported, but I get an error:

ncombined <- merge(x = sample_sort, y = MBE[,c("CATEGORY")], by = "id", all.x = TRUE)

Error in fix.by(by.y, y) : 'by' must specify a uniquely valid column

The final result should be as follows:

First Data frame Name: sample_sort
name                             id         supplier   usage  CATEGORY
ABC                             10000079    811121     1       MBE
DEF                             10000182    541513     4       MBE
Supplier C                      10000484    531110     1       MBE
Supplier D                      10000523    541320     1       MBE
Supplier E                      10000592    524210     1       MBE
Supplier F                      10012711    237110     1       NA

First, could you please put a little more effort into making your question readable. I have no clue as to what is supposed to be code and what is commentary. Secondly, maybe `cbind()` could work? Again it's really hard to read and understand your question. — Chase Grimm, Jun 27 '16 at 19:37
How do you expect to merge by `id` when you're only selecting `CATEGORY`? — ytk, Jun 27 '16 at 19:40
if you subset `MBE` to only contain the `CATEGORY` column, there is no longer any `id` column to merge on — moman822, Jun 27 '16 at 19:40
`ncombined <- merge(x = sample_sort, y = MBE[,c("id","CATEGORY")], by = "id", all.x = TRUE)` or `merge(x = sample_sort, y = MBE[,c(1,4)], by = "id", all.x = TRUE)` — Chase Grimm, Jun 27 '16 at 19:41
Thanks! That worked. I was missing the "id" column in the subset. — jalsa, Jun 27 '16 at 20:05
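
The point made in the comments can be checked with a small reproduction. This is only a sketch: the two tables are rebuilt by hand from the rows posted in the question, so the `data.frame()` calls and the column types are assumptions, and the real data has more columns. It shows that a single-column subset is no longer a data frame with an `id` column, and that keeping `id` next to `CATEGORY` lets `merge()` run.

```r
# Sample rows from the question, rebuilt by hand (column types are assumed).
sample_sort <- data.frame(
  name     = c("ABC", "DEF", "Supplier C", "Supplier D", "Supplier E", "Supplier F"),
  id       = c(10000079, 10000182, 10000484, 10000523, 10000592, 10012711),
  supplier = c(811121, 541513, 531110, 541320, 524210, 237110),
  usage    = c(1, 4, 1, 1, 1, 1)
)
MBE <- data.frame(
  id       = c(10000070, 10000182, 10000484, 10000526, 10000592, 10000680),
  State    = c("MD", "PR", "TX", "MI", "FL", "ID"),
  total    = c(5, 14, 1, 3, 1, 14),
  CATEGORY = rep("MBE", 6)
)

# Selecting a single column with [ , ] drops to a plain vector,
# so there is no "id" column left for merge() to join on.
only_category <- MBE[, c("CATEGORY")]
print(is.data.frame(only_category))

# Keeping the key column alongside CATEGORY makes the merge work.
ncombined <- merge(x = sample_sort, y = MBE[, c("id", "CATEGORY")],
                   by = "id", all.x = TRUE)
print(ncombined)
```

With the rows exactly as posted, only ids 10000182, 10000484 and 10000592 appear in both tables, so the other rows get `NA` in `CATEGORY`. Note also that `merge()` places the `by` column first in the result.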

score 2 · Answer 1 · answered Jun 27 '16 at 19:42

Try selecting only the columns you need (the key plus the column to import) before merging, e.g.

ncombined <- merge(x = sample_sort, y = MBE[, c("id", "CATEGORY")], by = "id", all.x = TRUE)
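
When only a single column is needed, a lookup with `match()` is another option that avoids `merge()` altogether. The sketch below is not part of the original answer; it uses cut-down, hand-built copies of the two tables and assumes `id` is unique in `MBE`, since `match()` only uses the first matching row.

```r
# Smaller hypothetical versions of the two tables from the question.
sample_sort <- data.frame(
  name = c("ABC", "DEF", "Supplier C"),
  id   = c(10000079, 10000182, 10000484)
)
MBE <- data.frame(
  id       = c(10000070, 10000182, 10000484),
  State    = c("MD", "PR", "TX"),
  CATEGORY = c("MBE", "MBE", "MBE")
)

# match() returns, for each sample_sort id, the row position of that id in MBE
# (NA when the id is absent); indexing CATEGORY by it performs the lookup.
sample_sort$CATEGORY <- MBE$CATEGORY[match(sample_sort$id, MBE$id)]
print(sample_sort)
```

Unlike `merge()`, this keeps the original row and column order of `sample_sort` and simply appends the new column.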

David
