I have a first data.frame, d1, that contains two numbers (A and B) for each organism:

Organism1 <- c("name1", "name3", "name5") 
Number1 <- c("numberA1", "numberA3", "numberA5") 
Number2 <- c("numberB1", "numberB3", "numberB5") 
d1 <- data.frame(Organism1, Number1, Number2)
d1

I have a second data.frame, d2, that contains the status of each organism:

Organism2 = c("name1", "name2", "name3", "name4", "name5", "name6")
Status = c("Bad", "Good", "Neutral", "Good", "Good", "Bad")
d2 = data.frame(Organism2, Status)
d2

I'd like to merge these two data frames into a third one, d3, that corresponds to d1 plus the Status column:

Organism3 = c("name1", "name3", "name5") 
Number1 = c("numberA1", "numberA3", "numberA5") 
Number2 = c("numberB1", "numberB3", "numberB5") 
Status3 = c("Bad", "Neutral", "Good")
d3 = data.frame(Organism3, Number1, Number2, Status3)
d3

The idea is just to add the status of each organism in d1 as a new column. For any d1 organism that is not in d2, d3 should contain NA.

I looked at the merge function but could not get the result I want.

thbtmntgn
  • It's moot now, but you should show what you tried with `merge`. That will help us understand where your confusion is and explain the parts you don't understand rather than just showing a solution. – Gregor Thomas May 04 '18 at 15:45

2 Answers

2

Using merge you obtain this output:

merge(d1, d2, by.x = "Organism1", by.y = "Organism2", all.x = TRUE)
  Organism1  Number1  Number2  Status
1     name1 numberA1 numberB1     Bad
2     name3 numberA3 numberB3 Neutral
3     name5 numberA5 numberB5    Good

But in your desired output d3 you have different values in Status. What's your logic?
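
To see how all.x = TRUE handles an organism that is missing from d2, here is a minimal sketch. The extra row "name7" and the names d1_extra and d3_extra are hypothetical additions for illustration; they are not part of the question's data.

```r
# Hypothetical d1 with one extra organism ("name7") that has no entry in d2
d1_extra <- data.frame(
  Organism1 = c("name1", "name3", "name5", "name7"),
  Number1   = c("numberA1", "numberA3", "numberA5", "numberA7"),
  Number2   = c("numberB1", "numberB3", "numberB5", "numberB7")
)
d2 <- data.frame(
  Organism2 = c("name1", "name2", "name3", "name4", "name5", "name6"),
  Status    = c("Bad", "Good", "Neutral", "Good", "Good", "Bad")
)

# all.x = TRUE keeps every row of the first data frame (a left outer join)
d3_extra <- merge(d1_extra, d2,
                  by.x = "Organism1", by.y = "Organism2",
                  all.x = TRUE)

# The unmatched organism is kept, with NA in the Status column
d3_extra[d3_extra$Organism1 == "name7", ]
```

Without all.x = TRUE, merge does an inner join and the "name7" row would be dropped instead.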

Terru_theTerror

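full_join() in the dplyr package is made for this. It keeps every row from both data frames, so organisms that appear only in d2 get NA in Number1 and Number2:

library(dplyr)

d1 %>%
  full_join(d2, by = c("Organism1" = "Organism2"))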

left_join() takes the same arguments but returns only the rows that appear in d1, with NA in Status for any organism missing from d2.
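
As a rough sketch of the left_join() variant, assuming dplyr is installed and using the data from the question (the result name d3 follows the question's wording):

```r
library(dplyr)  # assumption: dplyr is installed

d1 <- data.frame(
  Organism1 = c("name1", "name3", "name5"),
  Number1   = c("numberA1", "numberA3", "numberA5"),
  Number2   = c("numberB1", "numberB3", "numberB5")
)
d2 <- data.frame(
  Organism2 = c("name1", "name2", "name3", "name4", "name5", "name6"),
  Status    = c("Bad", "Good", "Neutral", "Good", "Good", "Bad")
)

# Keep all rows of d1 and look up Status in d2;
# the named vector maps d1's key column to d2's key column
d3 <- d1 %>%
  left_join(d2, by = c("Organism1" = "Organism2"))

d3
```

The result keeps the key column under the name used in d1 (Organism1) and has one row per row of d1.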

Steven