R - Add column to a first data.frame based on another data.frame sharing a similar column but with a different length

Question

I have a first data.frame d1 that contains 2 numbers (A and B) for each organism:

Organism1 <- c("name1", "name3", "name5") 
Number1 <- c("numberA1", "numberA3", "numberA5") 
Number2 <- c("numberB1", "numberB3", "numberB5") 
d1 <- data.frame(Organism1, Number1, Number2)
d1

I have a second data.frame d2 that contains the status of each organism:

Organism2 = c("name1", "name2", "name3", "name4", "name5", "name6")
Status = c("Bad", "Good", "Neutral", "Good", "Good", "Bad")
d2 = data.frame(Organism2, Status)
d2

I'd like to 'merge' these 2 data.frames to obtain a third one, d3, that corresponds to d1 plus the Status column:

Organism3 = c("name1", "name3", "name5") 
Number1 = c("numberA1", "numberA3", "numberA5") 
Number2 = c("numberB1", "numberB3", "numberB5") 
Status3 = c("Bad", "Neutral", "Good")
d3 = data.frame(Organism3, Number1, Number2, Status3)
d3

The idea is just to add the status column to each organism in d1. For any d1 organism that is not in d2, d3 should contain NA.

I looked at the merge function but could not get the result I want.

It's moot now, but you should show what you tried with `merge`. That will help us understand where your confusion is and explain the parts you don't understand rather than just showing a solution. — Gregor Thomas, May 04 '18 at 15:45

score 2 · Answer 1 · answered May 04 '18 at 15:38

Using merge you obtain this output:

merge(d1, d2, by.x = "Organism1", by.y = "Organism2", all.x = TRUE)
  Organism1  Number1  Number2  Status
1     name1 numberA1 numberB1     Bad
2     name3 numberA3 numberB3 Neutral
3     name5 numberA5 numberB5    Good

But your desired output d3 has different values in Status. What's your logic?
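
To illustrate the NA case the question asks about, here is a minimal sketch. It assumes a hypothetical extra organism "name7" (not in the original data) that is present in d1 but has no entry in d2. With all.x = TRUE, merge keeps that row and fills Status with NA:

```r
# d1 from the question, plus a hypothetical organism "name7" that is absent from d2
d1 <- data.frame(
  Organism1 = c("name1", "name3", "name5", "name7"),
  Number1   = c("numberA1", "numberA3", "numberA5", "numberA7"),
  Number2   = c("numberB1", "numberB3", "numberB5", "numberB7"),
  stringsAsFactors = FALSE
)

# d2 exactly as in the question
d2 <- data.frame(
  Organism2 = c("name1", "name2", "name3", "name4", "name5", "name6"),
  Status    = c("Bad", "Good", "Neutral", "Good", "Good", "Bad"),
  stringsAsFactors = FALSE
)

# all.x = TRUE keeps every row of d1, even when d2 has no matching key
d3 <- merge(d1, d2, by.x = "Organism1", by.y = "Organism2", all.x = TRUE)
print(d3)
```

Note that merge sorts the result by the key column by default. Passing sort = FALSE turns that off, although the documentation then leaves the row order unspecified.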

Terru_theTerror

My mistake, just edited my post. Thanks for the answer! – thbtmntgn May 04 '18 at 15:44

score 0 · Answer 2 · answered May 04 '18 at 15:38

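full_join() from the dplyr package is made for this kind of merge. It keeps every organism from both d1 and d2:

library(dplyr)

d1 %>%
  full_join(d2, by = c("Organism1" = "Organism2"))

left_join() takes the same arguments but keeps only the rows that appear in d1, which is the shape of the desired d3.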
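
As a sketch of the difference between the two joins, assuming dplyr is installed: the question's d1 and d2 are rebuilt here with stringsAsFactors = FALSE so the key columns are plain character vectors, and the names by_cols, d3_left and d3_full are just illustrative.

```r
library(dplyr)

# d1 and d2 as in the question
d1 <- data.frame(
  Organism1 = c("name1", "name3", "name5"),
  Number1   = c("numberA1", "numberA3", "numberA5"),
  Number2   = c("numberB1", "numberB3", "numberB5"),
  stringsAsFactors = FALSE
)
d2 <- data.frame(
  Organism2 = c("name1", "name2", "name3", "name4", "name5", "name6"),
  Status    = c("Bad", "Good", "Neutral", "Good", "Good", "Bad"),
  stringsAsFactors = FALSE
)

# Match d1$Organism1 against d2$Organism2
by_cols <- c("Organism1" = "Organism2")

# left_join keeps only the organisms present in d1 (3 rows)
d3_left <- left_join(d1, d2, by = by_cols)

# full_join keeps the organisms from both tables (6 rows);
# Number1 and Number2 are NA for organisms that appear only in d2
d3_full <- full_join(d1, d2, by = by_cols)

print(d3_left)
print(d3_full)
```

For the d3 described in the question, d3_left is probably the one you want, since it has exactly the rows of d1 plus the Status column.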

Steven
