1

I have two data sets. One looks something like this:

df <- data.frame(Book=c(1,1,1,2,2,2,3,3,3), 
                 Character=c('Sally', 'Jessie', 'Linda', 'Sally', 'Jessie', 'Linda', 'Sally', 'Jessie', 'Linda'),
                 Target=c('Jessie', 'Linda', 'Sally', 'Jessie', 'Linda', 'Sally', 'Jessie', 'Linda', 'Sally'),
                 Word=c('time', 'good', 'like', 'time', 'good', 'like', 'time', 'good', 'like'),
                 Total=c(1,2,4,2,4,3,2,4,4))

And the other looks something like this:

df2 <- data.frame(Character=c('Sally', 'Jessie', 'Linda'),
                 Word=c('time', 'good', 'like'),
                 AllTotal=c(5,10,11))

What I am trying to do is merge the sets, keeping the same dimensions as df + 1 new column (AllTotal). For example, I want to show that Sally said the word 'time' to Jessie 1 time in Book 1 out of a total of 5 times throughout the dataset. The final table layout would like something like this:

df3 <- data.frame(Book=c(1,1,1,2,2,2,3,3,3), 
                 Character=c('Sally', 'Jessie', 'Linda', 'Sally', 'Jessie', 'Linda', 'Sally', 'Jessie', 'Linda'),
                 Target=c('Jessie', 'Sally', 'Jessie', 'Jessie', 'Sally', 'Sally', 'Jessie', 'Linda', 'Sally'),
                 Word=c('time', 'good', 'like', 'time', 'good', 'like', 'time', 'good', 'like'),
                 Total=c(1,2,4,2,4,3,2,4,4),
                 AllTotal=c(5,10,11,5,10,11,5,10,11))

I need some kind of merge or matching code, such that whenever 'word' from df2 matches 'word' from df, a new column is created in df (AllTotal) that places the corresponding AllTotal value from df2.

My actual datasets do not run in a consistent pattern the way my examples do, so I can't just repeat the AllTotal values the way I did in my examples.

I'm new to R and coding and so I don't know all of the proper terms. I hope what I'm asking makes sense from the examples I provided. I have tried various versions of merge, inner_join, and for loops, and can't seem to get the right answer. Since I don't know what the proper term is for what I'm trying to do, my searches have been unhelpful.

Thanks in advance for your help!

BLS
  • 59
  • 6
  • I'm pretty sure what you want is just an inner join, since `merge(df,df2,by=c("Character","Word"))` seems to match your `df3` result. Is there something going wrong which means this doesn't work? – thelatemail Feb 22 '19 at 00:21
  • I think this worked, although it rearranges my column order (not a big deal). I didn't know to use c() in by= and was trying to match using a crazy for loop when the merge didn't do what I wanted. Thank you! – BLS Feb 22 '19 at 00:28
  • 1
    Possible duplicate of [How to join (merge) data frames (inner, outer, left, right)?](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) – divibisan Feb 22 '19 at 01:03

1 Answers1

1

As @thelatemail notes in their comment, an inner join by Character and Word seems to address your primary question:

result <- merge(df,df2,by=c("Character","Word"))

You can re-order the column based on the value in the Book column using:

result <- result[order(result$Book),c(3,1,4,2,5,6)]

tacuba
  • 23
  • 6