I have a dataset (nearly 26,000 rows) with the last name and the first names in separate columns, as shown below:

Id  Name  Firstname1  Firstname2  Firstname3
1   AL    BE          GAM         ZET
2   IO    PA          TA          MA

I want every possible permutation of the last and first names, one per result column, as follows:

Id  Name  Firstname1  Firstname2  Firstname3  Result1     Result2     ...  Resultn
1   AL    BE          GAM         ZET         ALBEGAMZET  ALBEZETGAM  ...  GAMZETBEAL
2   IO    PA          TA          ME          IOPATAME    IOPAMETA    ...  TAMEPAIO

Thanks for your help!

I've tried the following, but I'm looking for help generalizing it to all permutations:

# paste0() joins without a separator; paste() would insert spaces
df$result1 <- paste0(df$Name, df$Firstname1, df$Firstname2, df$Firstname3)

df$result2 <- paste0(df$Name, df$Firstname1, df$Firstname3, df$Firstname2)
  • Do you mean any order of the four name segments (4 x 3 x 2 x 1 = 24 permutations), or are there other rules which eliminate some combinations (e.g. never starts with Firstname3) or add others (e.g. perhaps only three of the name segments could be used, not all four)? – Jon Spring Jun 12 '23 at 22:28
  • @JonSpring yes any order of the four name segment. – user22063262 Jun 12 '23 at 22:48
  • This question seems pretty similar, does this work for you? https://stackoverflow.com/a/49998848/6851825 – Jon Spring Jun 12 '23 at 22:51
  • @JonSpring It doesn't work for me. – user22063262 Jun 12 '23 at 23:11

1 Answer

You can create the permutations of the column names with combinat::permn or gtools::permutations. Then, for each permutation, you can paste the columns together in the specified order.

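library(tidyverse)  # tribble(), map_dfc(), pmap_chr(), set_names(), bind_cols(), %>%

# Example data from the question
df1 <- tribble(
  ~Id, ~Name, ~Firstname1, ~Firstname2, ~Firstname3,
  1,   "AL",  "BE",        "GAM",       "ZET",
  2,   "IO",  "PA",        "TA",        "MA"
)

# Every ordering of the first-name columns (Id and Name are excluded)
options <- combinat::permn(colnames(df1)[-(1:2)])

# For each ordering, paste the selected columns together row by row
df2 <- map_dfc(options, ~ pmap_chr(df1[, c("Name", .x)], ~ paste0(..., collapse = ""))) %>%
  set_names(paste0("Result", seq_along(options))) %>%
  bind_cols(df1, .)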

So for each option .x, we are selecting the columns of the dataframe in that order (df1[,c("Name", .x)]), and pasting their rows together.
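
The asker confirmed in the comments on the question that any order of all four name segments is wanted (4 x 3 x 2 x 1 = 24 permutations). As a rough sketch of the same idea without extra packages, the base R version below permutes Name together with the first-name columns. The helper all_orderings() and the object names are made up for this illustration; they are not part of any package, and this is only one possible way to do it.

```r
# Example data from the question (row 2 uses "MA", as in the input table).
df <- data.frame(
  Id = 1:2,
  Name = c("AL", "IO"),
  Firstname1 = c("BE", "PA"),
  Firstname2 = c("GAM", "TA"),
  Firstname3 = c("ZET", "MA"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from a package): returns every ordering
# of the vector x as a list of vectors.
all_orderings <- function(x) {
  if (length(x) <= 1) return(list(x))
  out <- list()
  for (i in seq_along(x)) {
    # Put x[i] first, then append each ordering of the remaining elements.
    for (rest in all_orderings(x[-i])) {
      out[[length(out) + 1]] <- c(x[i], rest)
    }
  }
  out
}

# Permute every column except Id, so Name takes part in the reordering too.
name_cols <- setdiff(names(df), "Id")
orders <- all_orderings(name_cols)

# For each ordering, paste the columns together row-wise with no separator.
# paste0() is vectorised, so each ordering is handled in a single call.
results <- lapply(orders, function(cols) {
  do.call(paste0, unname(as.list(df[cols])))
})
names(results) <- paste0("Result", seq_along(orders))

df_out <- cbind(df, as.data.frame(results, stringsAsFactors = FALSE))
```

Because each ordering is pasted with one vectorised call over whole columns, the work is 24 paste0() calls regardless of the number of rows, which should be comfortable for a dataset of about 26,000 rows.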

  • The "Name" column should also be one of the columns to be swapped. I've modified the code slightly: options <- combinat::permn(colnames(df1)[-1]); df2 <- map_dfc(options, ~ pmap_chr(df1[, .x], ~ paste0(..., collapse = ""))) %>% set_names(paste0("Result", 1:length(options))) %>% bind_cols(df1, .) Thanks a lot! @Ricardo Semião e Castro – user22063262 Jun 12 '23 at 23:44