Need some help with my R code please folks!

My table has two columns

  • a list of codes, with numerous codes in the same "cell" separated by commas
  • a description that applies to all of the codes in the same row

I want to split the values in the first column so that there is only one code per row, with the corresponding description repeated for each code.

Sorry, I really don't know where to start, or even what to search for!

Gilrob

1 Answer

You can use separate_rows from tidyr:

library(tidyr)

separate_rows(df, numbers, convert = TRUE)
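If the codes are not plain numbers, it may help to pass sep explicitly. As far as I know, the default sep of separate_rows splits on any run of non-alphanumeric characters, so a code such as "A-10" would be broken at the hyphen. A minimal sketch, assuming a hypothetical data frame codes_df with columns codes and description (these names are made up for illustration):

```r
library(tidyr)

# Hypothetical example data: codes contain hyphens and a space after the comma
codes_df <- data.frame(
  codes = c("A-10", "B-20, B-21"),
  description = c("first", "second"),
  stringsAsFactors = FALSE
)

# Split only on a comma followed by optional whitespace
long <- separate_rows(codes_df, codes, sep = ",\\s*")
```

convert = TRUE is left out here because these codes are not numeric, so they should stay as character.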

Or in base R, we can use strsplit:

# Split each cell on commas, then repeat each description once per resulting code
s <- strsplit(df$numbers, split = ",")
output <- data.frame(numbers = unlist(s), descriptions = rep(df$descriptions, lengths(s)))
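If the real table has spaces around the commas or more than two columns, the same base R idea can be wrapped in a small helper. This is only a sketch: split_codes, example, codes and description are hypothetical names for illustration, not part of any package.

```r
# Hypothetical helper: one row per code, all other columns repeated as needed
split_codes <- function(df, code_col, sep = ",") {
  # Split each cell on the literal separator
  parts <- strsplit(as.character(df[[code_col]]), split = sep, fixed = TRUE)
  # Strip whitespace around each code, e.g. "B2, B3 ,B4"
  parts <- lapply(parts, trimws)
  # Repeat each original row once per code found in it
  out <- df[rep(seq_len(nrow(df)), lengths(parts)), , drop = FALSE]
  out[[code_col]] <- unlist(parts)
  rownames(out) <- NULL
  out
}

example <- data.frame(
  codes = c("A1", "B2, B3 ,B4"),
  description = c("first", "second"),
  stringsAsFactors = FALSE
)
result <- split_codes(example, "codes")
```

Indexing the rows with rep() rather than rebuilding the data frame column by column means any extra columns come along automatically. Note that a row whose code cell is an empty string yields no codes and is therefore dropped.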

Output (numbers is <int> here because of convert = TRUE; the base R version leaves it as character)

numbers   descriptions                 
<int>     <chr>                        
1         This is a description for ID1
2         This is a description for ID2
3         This is a description for ID2
4         This is a description for ID2
5         This is a description for ID3
6         This is a description for ID3

Data

df <- tibble(
  numbers = c("1", "2,3,4", "5,6"),
  descriptions = c("This is a description for ID1", "This is a description for ID2", "This is a description for ID3")
)

# numbers descriptions                 
# <chr>   <chr>                        
# 1       This is a description for ID1
# 2,3,4   This is a description for ID2
# 5,6     This is a description for ID3
AndrewGB