I have a data frame like this:

df <- data.frame(quadrant = rep(2:3, each = 3, times = 1),
                 ICD = c("S06|D11|A41|O34|O48",
                         "R55|A08|K40",
                         "N23|F13|K80|F19",
                         "R13|C18",
                         "E13|F19",
                         "D11|A41|N23|K80"))

I want to split the ICD string on the "|" delimiter and generate a new row for each ICD code, repeating the quadrant identifier on every row.

Does anybody know how to do this with dplyr::mutate?

mhovd
  • You probably want to use `tidyr::separate` (https://tidyr.tidyverse.org/reference/separate.html) – mhovd Jun 30 '22 at 11:52

2 Answers

I would use separate_rows from tidyr:

library(tidyr)

df |>
  separate_rows(ICD, sep = "\\|")
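
As a hedged alternative sketch: tidyr 1.3.0 and later mark separate_rows as superseded in favour of separate_longer_delim, which treats the delimiter as a fixed string by default, so the pipe needs no escaping. The example below assumes tidyr >= 1.3.0 and R >= 4.1 (for the native pipe); the name `long` is only illustrative.

```r
library(tidyr)

# Same example data as in the question
df <- data.frame(quadrant = rep(2:3, each = 3),
                 ICD = c("S06|D11|A41|O34|O48",
                         "R55|A08|K40",
                         "N23|F13|K80|F19",
                         "R13|C18",
                         "E13|F19",
                         "D11|A41|N23|K80"))

# delim is taken literally here, so "|" is not a regex alternation
long <- df |>
  separate_longer_delim(ICD, delim = "|")

# One row per ICD code; quadrant is repeated for each code
nrow(long)
```

On older tidyr versions the separate_rows call above remains the option to use.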
harre

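Since you asked for dplyr::mutate: you can split each string into a list column with strsplit inside mutate, then expand that list column into rows with tidyr::unnest: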

df <- data.frame(quadrant = rep(2:3, each = 3, times = 1),
                 ICD = c("S06|D11|A41|O34|O48",
                         "R55|A08|K40",
                         "N23|F13|K80|F19",
                         "R13|C18",
                         "E13|F19",
                         "D11|A41|N23|K80"))

library(dplyr)
library(tidyr)

df %>% 
  dplyr::mutate(ICD = strsplit(as.character(ICD), "|", fixed = TRUE)) %>%
  unnest(ICD)
#> # A tibble: 20 × 2
#>    quadrant ICD  
#>       <int> <chr>
#>  1        2 S06  
#>  2        2 D11  
#>  3        2 A41  
#>  4        2 O34  
#>  5        2 O48  
#>  6        2 R55  
#>  7        2 A08  
#>  8        2 K40  
#>  9        2 N23  
#> 10        2 F13  
#> 11        2 K80  
#> 12        2 F19  
#> 13        3 R13  
#> 14        3 C18  
#> 15        3 E13  
#> 16        3 F19  
#> 17        3 D11  
#> 18        3 A41  
#> 19        3 N23  
#> 20        3 K80

Created on 2022-06-30 by the reprex package (v2.0.1)
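
A small base R sketch of why the split above passes fixed = TRUE: in a regular expression, an unescaped "|" means alternation, so it has to be either escaped or treated literally. The variable names here are illustrative only.

```r
x <- "R13|C18"

# Literal split: fixed = TRUE treats "|" as a plain character
literal <- strsplit(x, "|", fixed = TRUE)[[1]]

# Equivalent regex form: the pipe is escaped
escaped <- strsplit(x, "\\|")[[1]]

# Unescaped regex "|" does not split on the pipe character itself,
# so the result differs from the two calls above
unescaped <- strsplit(x, "|")[[1]]

identical(literal, escaped)    # TRUE
identical(literal, unescaped)  # FALSE
```

The same reasoning explains the "\\|" separator in the separate_rows answer, where sep is interpreted as a regular expression.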

mhovd
Quinten