I have a data frame that has a structure similar to the following:

df <- data.frame(sample_id = c("s1", "s2", "s3"), 
                 group = c("1", "2", "3"),
                 frequency = c("4", "2", "2"))

print(df)

sample_id    group   frequency
  s1           1         4      
  s2           2         2      
  s3           3         2

For further manipulation and compatibility with downstream functions, I am looking for a way to expand this data frame so that each row is repeated as many times as the number in its frequency column, with frequency set to 1 in every resulting row. The expected output should look like this:

sample_id    group   frequency
  s1           1         1
  s1           1         1
  s1           1         1
  s1           1         1      
  s2           2         1
  s2           2         1      
  s3           3         1
  s3           3         1

I'd appreciate any help with this (the real data set I have is huge, and I could not figure out an efficient way to do it).

KST

1 Answer

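We could use uncount() from the tidyr package. type.convert() first turns the character frequency column into integers, since uncount() needs numeric weights. Note that uncount() drops the frequency column by default: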

library(dplyr)
library(tidyr)

df %>% 
  type.convert(as.is = TRUE) %>% 
  uncount(frequency)

  sample_id group
1        s1     1
2        s1     1
3        s1     1
4        s1     1
5        s2     2
6        s2     2
7        s3     3
8        s3     3
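
The output above no longer has a frequency column, whereas the expected output in the question keeps it with a value of 1 in every row. Below is a minimal sketch of two ways to get that shape: adding the column back after uncount(), and a base R variant that repeats row indices with rep(). The object names df_num, expanded_tidyr, idx and expanded_base are made up for illustration, and no claim is made here about which variant is faster on a large data set.

```r
library(tidyr)

df <- data.frame(sample_id = c("s1", "s2", "s3"),
                 group = c("1", "2", "3"),
                 frequency = c("4", "2", "2"))

# Convert the character columns to their natural types
# (group and frequency become integer, sample_id stays character).
df_num <- type.convert(df, as.is = TRUE)

# tidyr: repeat each row `frequency` times, then add the column back as 1.
expanded_tidyr <- uncount(df_num, frequency)
expanded_tidyr$frequency <- 1L

# Base R: build a vector of repeated row indices and subset with it.
idx <- rep(seq_len(nrow(df_num)), times = df_num$frequency)
expanded_base <- df_num[idx, ]
expanded_base$frequency <- 1L
rownames(expanded_base) <- NULL  # reset row names such as "1.1", "1.2"
```

Both results should have eight rows (four for s1, two each for s2 and s3) with the columns sample_id, group and frequency.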
TarJae