0

My data are similar to the following data. Note some two surnames or names which I did not mentioned here

df1<-read.table(text= "id Surname Name
1234 Hamidsar Smith
139 Sandarscom Vicky
234 Bates May
100 Murphu Beki
941 Jool Susan
469 Sali John
990 susai Alison",header=TRUE)

The another data frame is as follows:

df2<-read.table(text= "id Surname Name
990 susai Anis
81B Rosak Roy
340 Molipoor Vicky
139 Bates May
941 Ameri David
990 susai Alison
139 Bates May
101 CICI Beki
139 Sandarscom Vicky
1234 Hamidsar Smith",header=TRUE)

I want to map id , surname and name to get the following table

id Surname Name Map
1234 Hamidsar Smith id,Surname,Name
139 Sandarscom Vicky id,Surname,Name
234 Bates May Surname, Name
100 Murphu Beki Name
941 Jool Susan id
469 Sali John NA
990 susai Alison id,Surname,Name

Is it possible to get it ? I read some links, but they did not help me

  • Does this answer your question? [How to join (merge) data frames (inner, outer, left, right)](https://stackoverflow.com/questions/1299871/how-to-join-merge-data-frames-inner-outer-left-right) – asachet Mar 02 '20 at 16:49
  • You're right, your expected output is not the result of a join. I am reading more carefully now but I cannot make sense of your expected output at all. Can you try to clarify what it means? – asachet Mar 02 '20 at 16:52
  • I still don't understand. susai Alison is matched for id, Surname and Name in both dataframes, so what is the logic for Surname to only show up for that case? – Phil Mar 02 '20 at 17:13
  • ok.. So what is the logic for Name not to show up? – Phil Mar 02 '20 at 17:18
  • No - 990 susai Alison shows up in both data frames, but you only want id and Surname to be given to that case, and not Name. Why? – Phil Mar 02 '20 at 17:19

1 Answers1

2
  library(tidyverse)
  library(glue)

First, I'm setting stringsAsFactors=FALSE to your code to produce the data to avoid pain down the road.

  df1<-read.table(text= "id Surname Name
1234    Hamidsar    Smith
139 Sandarscom  Vicky
234 Bates   May
100 Murphu  Beki
941 Jool    Susan
469 Sali    John
990 susai   Alison",header=TRUE, stringsAsFactors=FALSE)

  df2<-read.table(text= "id Surname Name
990 susai   Anis
81B Rosak   Roy
340 Molipoor    Vicky
139 Bates   May
941 Ameri   David
990 susai   Alison
139 Bates   May
101 CICI    Beki
139 Sandarscom  Vicky
1234    Hamidsar    Smith",header=TRUE, stringsAsFactors=FALSE)

Below I ensure id is set to character type so that it's the same in df2 and then I'm just applying a lookup to say "if this value is in df2, show the name of the variable, otherwise nothing". Then I'm using glue to put them together into a Map variable.

  df1 %>% 
    mutate(id = as.character(id),
           Map_id = if_else(id %in% df2$id, "id", NA_character_),
           Map_Surname = if_else(Surname %in% df2$Surname, "Surname", NA_character_),
           Map_Name = if_else(Name %in% df2$Name, "Name", NA_character_),
           Map = glue("{Map_id} {Map_Surname} {Map_Name}", .na = "") %>% 
             str_trim() %>% 
             str_replace_all(" ", ", ")) %>% 
    select(id, Surname, Name, Map)

#>     id    Surname   Name               Map
#> 1 1234   Hamidsar  Smith id, Surname, Name
#> 2  139 Sandarscom  Vicky id, Surname, Name
#> 3  234      Bates    May     Surname, Name
#> 4  100     Murphu   Beki              Name
#> 5  941       Jool  Susan                id
#> 6  469       Sali   John                  
#> 7  990      susai Alison id, Surname, Name
Phil
  • 7,287
  • 3
  • 36
  • 66