I have a dataframe (snippet shown below) whose cells I want to split so that I can assign other information to them and then paste them back together. The issue I'm having is splitting them up while keeping each row together, if that makes sense. Here is an example of what I have and what I'm trying to do:

current df
newcolumn      code         name           place
NA/NA          121/102      John/James     GBR/GBR
NA/NA          100/103      Harry/Peter    GBR/GBR
NA/NA          113/111      Will/Jamie     GBR/GBR
NA/NA          109/112      Brian/Steve    GBR/GBR

I now wish to separate this df into something like this:

   newcolumn  code     name    place
    NA        121      John    GBR
    NA        102      James   GBR
    NA        100      Harry   GBR
    NA        103      Peter   GBR
    NA        113      Will    GBR
    NA        111      Jamie   GBR
    NA        109      Brian   GBR
    NA        112      Steve   GBR

Then, after I have filled in my newcolumn, I want to be able to paste the rows back together (maybe using a loop?), but this will use rows 1 and 2, 3 and 4, and so on.

zx8754
Joe
  • Why not fill in the "newcolumn" without splitting other columns? – zx8754 Jun 06 '23 at 09:35
    @benson23 They want to split, update new column, then paste it back together. Linked post only the 1st step. – zx8754 Jun 06 '23 at 09:43
  • @Joe Can you explain more on *after I have filled in my newcolumn I want to be able to past back together*, or even better, give a final desired output (as indicated by zx8754, I assume your second table is only an intermediate product) – benson23 Jun 06 '23 at 09:56
  • @benson23, this question provides quite a nice example of separating **multiple** columns, though, either through `lapply` + `strsplit` or a `separate_longer_delim()` one-liner. I.e. `mutate(df_, row_id = row_number(), .before = 1) %>% separate_longer_delim(newcolumn:place, "/")` for the first step and `summarise(df_long, across(newcolumn:place, ~ paste0(.x, collapse = "/")), .by = row_id)` for the last. – margusl Jun 06 '23 at 12:20

1 Answer

A combination of strsplit() and lapply() gives the desired result for the 1st part:

df <- data.frame(
  newcolumn = rep("NA/NA", 4),
  code = c("121/102", "100/103", "113/111", "109/112"),
  name = c("John/James", "Harry/Peter", "Will/Jamie",
           "Brian/Steve"),
  place = c("GBR/GBR", "GBR/GBR", "GBR/GBR", "GBR/GBR")
)

df
#>   newcolumn    code        name   place
#> 1     NA/NA 121/102  John/James GBR/GBR
#> 2     NA/NA 100/103 Harry/Peter GBR/GBR
#> 3     NA/NA 113/111  Will/Jamie GBR/GBR
#> 4     NA/NA 109/112 Brian/Steve GBR/GBR

df_split <-
  as.data.frame(lapply(df, function(x) {
    unlist(strsplit(x, split = "/", fixed = TRUE))
  }))

df_split
#>   newcolumn code  name place
#> 1        NA  121  John   GBR
#> 2        NA  102 James   GBR
#> 3        NA  100 Harry   GBR
#> 4        NA  103 Peter   GBR
#> 5        NA  113  Will   GBR
#> 6        NA  111 Jamie   GBR
#> 7        NA  109 Brian   GBR
#> 8        NA  112 Steve   GBR
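
If you also want each split row to remember which original row it came from (so that pasting back does not depend on every cell holding exactly two values), one possible variation is to add an identifier while splitting. This is only a sketch: `row_id`, `pieces`, `n_pieces` and `df_split_id` are names made up for illustration, and it assumes that every column of a given row splits into the same number of pieces.

```r
df <- data.frame(
  newcolumn = rep("NA/NA", 4),
  code = c("121/102", "100/103", "113/111", "109/112"),
  name = c("John/James", "Harry/Peter", "Will/Jamie", "Brian/Steve"),
  place = rep("GBR/GBR", 4),
  stringsAsFactors = FALSE
)

# Split every column on "/"; each column becomes a list with one element per row
pieces <- lapply(df, strsplit, split = "/", fixed = TRUE)

# Number of pieces found in each original row (taken from the first column)
n_pieces <- lengths(pieces[[1]])

# Assumption check: all columns yield the same number of pieces per row
stopifnot(all(vapply(pieces,
                     function(p) identical(lengths(p), n_pieces),
                     logical(1))))

# Flatten the pieces and record the original row number (hypothetical row_id)
df_split_id <- data.frame(
  row_id = rep(seq_len(nrow(df)), times = n_pieces),
  lapply(pieces, unlist),
  stringsAsFactors = FALSE
)
```

Here `df_split_id` holds the same eight rows as `df_split`, plus a `row_id` column that repeats 1, 1, 2, 2, and so on.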

For the second part, a combination of mapply(), paste() and selecting alternating rows with seq() is one option:

df_split$newcolumn <- letters[seq_len(nrow(df_split))]

df_new <- mapply(paste,
                 df_split[seq(from = 1, to = nrow(df_split), by = 2), ],
                 df_split[seq(from = 2, to = nrow(df_split), by = 2), ],
                 SIMPLIFY = FALSE,
                 MoreArgs = list(sep = "/"))
df_new <- as.data.frame(df_new)

df_new
#>   newcolumn    code        name   place
#> 1       a/b 121/102  John/James GBR/GBR
#> 2       c/d 100/103 Harry/Peter GBR/GBR
#> 3       e/f 113/111  Will/Jamie GBR/GBR
#> 4       g/h 109/112 Brian/Steve GBR/GBR

Created on 2023-06-06 with reprex v2.0.2
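
As a possible alternative to selecting alternating rows, the pieces could be pasted back per group with aggregate(), which would also cope with rows that were split into more than two pieces. A minimal sketch, assuming the long data carries a `row_id` column identifying the original row (a hypothetical column, not part of the data above; `df_long` and `df_back` are illustrative names as well):

```r
# Long data with a hypothetical row_id column marking the original row
df_long <- data.frame(
  row_id = rep(1:4, each = 2),
  newcolumn = letters[1:8],
  code = c("121", "102", "100", "103", "113", "111", "109", "112"),
  name = c("John", "James", "Harry", "Peter",
           "Will", "Jamie", "Brian", "Steve"),
  place = "GBR",
  stringsAsFactors = FALSE
)

# Collapse every non-id column within each row_id, joining pieces with "/"
df_back <- aggregate(df_long[-1],
                     by = list(row_id = df_long$row_id),
                     FUN = paste, collapse = "/")
```

The result has one row per `row_id`, with each cell holding the pieces of that group joined by "/".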

rps1227