I have been searching and searching and have resolved to post! I'm still pretty new to R.

I have 2 data frames. The large one is HEAT and the small one is EE.

I have managed to do a left join to get EE matched up with HEAT.

df(HEAT)
Date Time    EVENT    Person    PersonID
DTgroup1     X        Code      Code
DTgroup2     X        Code      Code
DTgroup3     Y        Code      Code
....

Then there is:

df(EE)
Person ID    Type    var 3    var 4    var 5

here is the merge that I used:

merge <- left_join(HEAT, EE)

I have managed to merge the two data frames, but I lose all the data in df(EE) except for the PersonID that it shares with df(HEAT).

Does anyone have any advice about what I am doing wrong? Thanks a bunch!

mysteRious
Viss
  • How do you expect your output to look? What do you join on? `left_join` will try to find a match for every row of HEAT inside EE. If there is a match it will be joined; otherwise it will produce NAs. It doesn't care about EE rows that don't match HEAT. Maybe you wanted a `full_join`? – AntoniosK May 24 '18 at 21:45
  • so does that mean that I need to do a full_join on PersonID? to keep the rest of the data? – Viss May 24 '18 at 21:57
  • Yes, if you want to keep all the data, not just the cases where you have a match, you need a `full_join`. If you care more about the EE data you can do a `right_join(HEAT, EE)` or `left_join(EE, HEAT)`. There are some nice links with info in the answers below. – AntoniosK May 24 '18 at 22:09

2 Answers

A left join will keep all rows on the left side, in your case HEAT, and add the columns from the right side (EE) wherever there is a match; rows without a match get NA in those columns.

An inner join would only return records that have a match on both sides; in your case, one record would be returned.
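
As a rough illustration of that difference, here is a minimal sketch using dplyr. The data frames are toy stand-ins for HEAT and EE; the values (`A1`, `B9`, `staff`, and so on) are invented for the example and are not taken from the question.

```r
library(dplyr)

# Toy stand-ins for the data frames in the question (all values are invented)
HEAT <- data.frame(
  EVENT    = c("X", "X", "Y"),
  PersonID = c("A1", "A2", "A3"),
  stringsAsFactors = FALSE
)
EE <- data.frame(
  PersonID = c("A1", "B9"),
  Type     = c("staff", "visitor"),
  stringsAsFactors = FALSE
)

# Left join: every HEAT row is kept; Type is NA where EE has no matching PersonID
left <- left_join(HEAT, EE, by = "PersonID")

# Inner join: only rows whose PersonID appears in both data frames
inner <- inner_join(HEAT, EE, by = "PersonID")

nrow(left)   # 3
nrow(inner)  # 1
```

Note that the EE row for `B9` appears in neither result, because it has no partner in HEAT.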

See What is the difference between “INNER JOIN” and “OUTER JOIN”? for more info.

dopple

Obviously, you want a

merge <- full_join(HEAT, EE)

Here is a nice cheat sheet page: http://stat545.com/bit001_dplyr-cheatsheet.html. And here are some super nice graphics: http://r4ds.had.co.nz/relational-data.html
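
For what it's worth, a small sketch of the full join with the key spelled out via `by`. The data is made up, and the differently named key column `Person.ID` in `EE2` is purely hypothetical, included only to show the named-vector form of `by`.

```r
library(dplyr)

# Invented example data; only the column names EVENT, PersonID and Type echo the question
HEAT <- data.frame(
  EVENT    = c("X", "X", "Y"),
  PersonID = c("A1", "A2", "A3"),
  stringsAsFactors = FALSE
)
EE <- data.frame(
  PersonID = c("A1", "B9"),
  Type     = c("staff", "visitor"),
  stringsAsFactors = FALSE
)

# Full join: keeps every row from both sides, filling the gaps with NA
full <- full_join(HEAT, EE, by = "PersonID")
nrow(full)  # 4

# Hypothetical case: the key has a different name in the second data frame
EE2   <- rename(EE, Person.ID = PersonID)
full2 <- full_join(HEAT, EE2, by = c("PersonID" = "Person.ID"))
nrow(full2)  # 4
```

Naming the key explicitly also avoids relying on dplyr's default of joining on every column name the two data frames have in common.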

Gwang-Jin Kim