1

Objective

I have two datasets: df1 and df2

df1


Date           Name      Duration

1/2/2020       Tanisha   50
1/3/2020       Lisa      10
1/5/2020       Lisa      10



df2


Date           Name      Duration

1/2/2020       Tanisha   80
1/3/2020       Lisa      50
1/5/2020       Tom       10

Desired output:

  Date           Name      Duration        Date           Name       Duration

  1/2/2020       Tanisha   50              1/2/2020       Tanisha     80  
  1/3/2020       Lisa      10              1/3/2020       Lisa        50

I wish to match the contents in the name column with df1 and df2 as well as the Date column

Dput for df1 and df2:

 structure(list(Date = structure(1:3, .Label = c("1/2/2020", "1/3/2020", 
 "1/5/2020"), class = "factor"), Name = structure(c(2L, 1L, 1L
  ), .Label = c("Lisa", "Tanisha"), class = "factor"), Duration = c(50L, 
 10L, 10L), X = c(NA, NA, NA), X.1 = c(NA, NA, NA), X.2 = c(NA, 
 NA, NA), X.3 = c(NA, NA, NA)), class = "data.frame", row.names = c(NA, 
 -3L))



structure(list(Date = structure(1:3, .Label = c("1/2/2020", "1/3/2020", 
"1/5/2020"), class = "factor"), Name = structure(c(2L, 1L, 3L
), .Label = c("lisa", "tanisha", "tom"), class = "factor"), Duration2 = c(80L, 
50L, 10L)), class = "data.frame", row.names = c(NA, -3L))

What I have tried:

horizontal merge

 merge(df1, df2, all.x=True)

I am unsure how to match the Name AND Date contents

Any help is appreciated.

Matt
  • 7,255
  • 2
  • 12
  • 34
Lynn
  • 4,292
  • 5
  • 21
  • 44

1 Answers1

2

It is a simple merge but your Name columns are not uniform. Convert them to similar form (either upper, lower or title case) and then merge. Also no need to have duplicate columns for Date and Name since they carry the exact same information.

library(dplyr)
df1 %>% mutate(Name = tolower(Name)) %>% inner_join(df2, by = c('Date', 'Name'))

Or in base R :

merge(transform(df1, Name = tolower(Name)), df2, by = c('Date', 'Name'))


#      Date    Name Duration Duration2
#1 1/2/2020 tanisha       50        80
#2 1/3/2020    lisa       10        50
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • Hello @ronak, the columns in my real dataset are both uppercase and match. Can I just remove the 'to lower? df1 %>% mutate(Name = (Name)) %>% inner_join(df2, by = c('Date', 'Name')) ?? – Lynn Apr 01 '20 at 05:49
  • 1
    In that case you don't need `mutate`, do `inner_join(df1, df2, by = c('Date', 'Name'))` – Ronak Shah Apr 01 '20 at 05:56