1

2 dfs, need to match "Name" with info$Name and replace corresponding values in details$Salary , df - details should retain all values and there should be no NAs(if match found replace the value if not found leave as it is)

details<- data.frame(Name = c("Aks","Bob","Caty","David","Enya","Fredrick","Gaby","Hema","Isac","Jaby","Katy"),
                     Age = c(12,22,33,43,24,67,41,19,25,24,32),
                     Gender = c("f","m","m","f","m","f","m","f","m","m","m"),
                     Salary = c(1500,2000,3.6,8500,1.2,1400,2300,2.5,5.2,2000,1265))

info <- data.frame(Name = c("caty","Enya","Dadi","Enta","Billu","Viku","situ","Hema","Ignu","Isac"),
                income = c(2500,5600,3200,1522,2421,3121,4122,5211,1000,3500))   

Expected Result :

Name      Age Gender Salary
Aks       12      f   1500
Bob       22      m   2000
Caty      33      m   2500
David     43      f   8500
Enya      24      m   5600
Fredrick  67      f   1400
Gaby      41      m   2300
Hema      19      f   5211
Isac      25      m   3500
Jaby      24      m   2000
Katy      32      m   1265     

None of the following is giving expected result

dplyr::left_join(details,info,by = "Name") 
dplyr::right_join(details,info,by = "Name") 
dplyr::inner_join(details,info, by ="Name") # for other matching and replace this works fine but not here
dplyr:: full_join(details,info,by ="Name")

All the results are giving NA's , tried using match function also but it is not giving desired result, any help would be highly appreciated

Sang won kim
  • 524
  • 5
  • 21
rajeswa
  • 47
  • 9

2 Answers2

2

You have Name in both the dataframe in different cases, we need to first bring them in the same case then do a left_join with them and use coalesce to select the first non-NA value between income and salary.

library(dplyr)

details %>% mutate(Name = stringr::str_to_title(Name)) %>%
  left_join(info %>% mutate(Name = stringr::str_to_title(Name)), by = "Name") %>%
  mutate(Salary = coalesce(income, Salary)) %>%
  select(names(details))

#       Name Age Gender Salary
#1       Aks  12      f   1500
#2       Bob  22      m   2000
#3      Caty  33      m   2500
#4     David  43      f   8500
#5      Enya  24      m   5600
#6  Fredrick  67      f   1400
#7      Gaby  41      m   2300
#8      Hema  19      f   5211
#9      Isac  25      m   3500
#10     Jaby  24      m   2000
#11     Katy  32      m   1265
Ronak Shah
  • 377,200
  • 20
  • 156
  • 213
  • just a followup question, suppose there is no uppercase lowercase issue , then would it work ? details %>% left_join(details,info,by ="Name") %>% mutate(Salary = coalesce(income, Salary)) %>% select(names(details)), giving error – rajeswa Jan 17 '20 at 07:19
  • @rajeshdhingra You are passing 3 dataframes to `left_join`, use `details %>% left_join(info,by ="Name") %>% mutate(Salary = coalesce(income, Salary)) %>% select(names(details))` – Ronak Shah Jan 17 '20 at 07:23
1

A base R solution:


matches <- match(tolower(details$Name), tolower(info$Name))
match <-  !is.na(matches)

details$Salary[match] <- info$income[matches[match]]

#Result
Name Age Gender Salary
1       Aks  12      f   1500
2       Bob  22      m   2000
3      Caty  33      m   2500
4     David  43      f   8500
5      Enya  24      m   5600
6  Fredrick  67      f   1400
7      Gaby  41      m   2300
8      Hema  19      f   5211
9      Isac  25      m   3500
10     Jaby  24      m   2000
11     Katy  32      m   1265

Humpelstielzchen
  • 6,126
  • 3
  • 14
  • 34