2

I have a dataframe such as

COL1          COL2 
Canis_lupus   1_Group367,4_Group89,70_Group900
Danio_rerio   8_Group78
Felis_cattus  6_Group9,90_Group97

And I would like to transform it as :

COL1          Groups   Value
Canis_lupus   Group367 1
Canis_lupus   Group89  4
Canis_lupus   Group900 70
Danio_rerio   Group78  8
Felis_cattus  Group9   6
Felis_cattus  Group97  90 

Where I separate information in COL2 into two new columns Groups and Value, where groups are the GroupX elements, and Values are the number before the first _

Here is the dput format if it can helps:

structure(list(COL1 = c("Canis_lupus", "Danio_rerio", "Felis_cattus"
), COL2 = c("1_Group367,4_Group89,70_Group900", "8_Group78", 
"6_Group9,90_Group97")), class = "data.frame", row.names = c(NA, 
-3L))
camille
  • 16,432
  • 18
  • 38
  • 60
chippycentra
  • 3,396
  • 1
  • 6
  • 24
  • Does this answer your question? [tidyr: Separate a column into a variable number of columns](https://stackoverflow.com/questions/63969719/tidyr-separate-a-column-into-a-variable-number-of-columns) – Dan Adams Dec 30 '21 at 13:44

1 Answers1

4

A possible solution:

library(tidyverse)

df <- structure(list(COL1 = c("Canis_lupus", "Danio_rerio", "Felis_cattus"
), COL2 = c("1_Group367,4_Group89,70_Group900", "8_Group78", 
            "6_Group9,90_Group97")), class = "data.frame", row.names = c(NA, 
                                                                         -3L))
df %>% 
  separate_rows(COL2, sep=",") %>% 
  separate(COL2, into=c("Value", "COL2"),sep = "_", convert = TRUE)

#> # A tibble: 6 × 3
#>   COL1         Value COL2    
#>   <chr>        <int> <chr>   
#> 1 Canis_lupus      1 Group367
#> 2 Canis_lupus      4 Group89 
#> 3 Canis_lupus     70 Group900
#> 4 Danio_rerio      8 Group78 
#> 5 Felis_cattus     6 Group9  
#> 6 Felis_cattus    90 Group97
PaulS
  • 21,159
  • 2
  • 9
  • 26