Merging different length data sets in R by matching values in columns?

Question

I have two data sets. One looks something like this:

df <- data.frame(Book=c(1,1,1,2,2,2,3,3,3), 
                 Character=c('Sally', 'Jessie', 'Linda', 'Sally', 'Jessie', 'Linda', 'Sally', 'Jessie', 'Linda'),
                 Target=c('Jessie', 'Linda', 'Sally', 'Jessie', 'Linda', 'Sally', 'Jessie', 'Linda', 'Sally'),
                 Word=c('time', 'good', 'like', 'time', 'good', 'like', 'time', 'good', 'like'),
                 Total=c(1,2,4,2,4,3,2,4,4))

And the other looks something like this:

df2 <- data.frame(Character=c('Sally', 'Jessie', 'Linda'),
                 Word=c('time', 'good', 'like'),
                 AllTotal=c(5,10,11))

What I am trying to do is merge the sets, keeping the same dimensions as df + 1 new column (AllTotal). For example, I want to show that Sally said the word 'time' to Jessie 1 time in Book 1 out of a total of 5 times throughout the dataset. The final table layout would like something like this:

df3 <- data.frame(Book=c(1,1,1,2,2,2,3,3,3), 
                 Character=c('Sally', 'Jessie', 'Linda', 'Sally', 'Jessie', 'Linda', 'Sally', 'Jessie', 'Linda'),
                 Target=c('Jessie', 'Sally', 'Jessie', 'Jessie', 'Sally', 'Sally', 'Jessie', 'Linda', 'Sally'),
                 Word=c('time', 'good', 'like', 'time', 'good', 'like', 'time', 'good', 'like'),
                 Total=c(1,2,4,2,4,3,2,4,4),
                 AllTotal=c(5,10,11,5,10,11,5,10,11))

I need some kind of merge or matching code, such that whenever 'word' from df2 matches 'word' from df, a new column is created in df (AllTotal) that places the corresponding AllTotal value from df2.

My actual datasets do not run in a consistent pattern the way my examples do, so I can't just repeat the AllTotal values the way I did in my examples.

I'm new to R and coding and so I don't know all of the proper terms. I hope what I'm asking makes sense from the examples I provided. I have tried various versions of merge, inner_join, and for loops, and can't seem to get the right answer. Since I don't know what the proper term is for what I'm trying to do, my searches have been unhelpful.

Thanks in advance for your help!

I'm pretty sure what you want is just an inner join, since `merge(df,df2,by=c("Character","Word"))` seems to match your `df3` result. Is there something going wrong which means this doesn't work? — thelatemail, Feb 22 '19 at 00:21
I think this worked, although it rearranges my column order (not a big deal). I didn't know to use c() in by= and was trying to match using a crazy for loop when the merge didn't do what I wanted. Thank you! — BLS, Feb 22 '19 at 00:28
Possible duplicate of [How to join (merge) data frames (inner, outer, left, right)?](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) — divibisan, Feb 22 '19 at 01:03

tacuba · Answer 1 · 2019-02-22T00:49:11.297

1

As @thelatemail notes in their comment, an inner join by Character and Word seems to address your primary question:

result <- merge(df,df2,by=c("Character","Word"))

You can re-order the column based on the value in the Book column using:

result <- result[order(result$Book),c(3,1,4,2,5,6)]

edited Feb 22 '19 at 00:49

answered Feb 22 '19 at 00:42

tacuba

23
6

Merging different length data sets in R by matching values in columns?

1 Answers1