
I have a compact data.frame or data.table containing row-wise info on ranges (dt.compact).

library(data.table)
dt.compact = data.table(chr = c('chr1','chr1','chr2','chr2'), start = c(1,5,2,7), stop = c(3,7,3,8))

# Output dt.compact
    chr start stop
1: chr1     1    3
2: chr1     5    7
3: chr2     2    3
4: chr2     7    8

Now I want a simple way to generate a long data.frame or data.table with one row per position. The output should look like the following:

# Output
do.call(data.table, list(V1 = c(rep('chr1', 6),rep('chr2', 4)), V2 = c(1:3, 5:7, 2:3, 7:8)))

     V1 V2
 1: chr1  1
 2: chr1  2
 3: chr1  3
 4: chr1  5
 5: chr1  6
 6: chr1  7
 7: chr2  2
 8: chr2  3
 9: chr2  7
10: chr2  8


Any suggestions on how to achieve this? I thought about mapply(myOwnFunction, ...), but maybe there is already a built-in solution?

Any thoughts are welcome.

Danyou

3 Answers


You can do:

dt.compact[, .(chr, num = seq(start, stop)), by = 1:nrow(dt.compact)][, -1]

Output:

     chr num
 1: chr1   1
 2: chr1   2
 3: chr1   3
 4: chr1   5
 5: chr1   6
 6: chr1   7
 7: chr2   2
 8: chr2   3
 9: chr2   7
10: chr2   8
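Grouping by row is needed because seq() expands only one start/stop pair per call. As a rough sketch of the same idea without per-row grouping (an alternative formulation, not part of the original answer), the chr values can be repeated with rep() while Map() expands each range:

```r
library(data.table)

dt.compact <- data.table(chr = c("chr1", "chr1", "chr2", "chr2"),
                         start = c(1, 5, 2, 7), stop = c(3, 7, 3, 8))

# Repeat each chr once per position in its range, and expand every
# start/stop pair with seq(); unlist() flattens the per-row sequences.
dt.long <- dt.compact[, .(chr = rep(chr, stop - start + 1),
                          num = unlist(Map(seq, start, stop)))]
dt.long
```

This assumes every row has start <= stop; a row with start > stop would make seq() count downwards and the rep() count would no longer match.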

Edit: This question does have a duplicate; however, here is a variation on the approach above, suggested by @jogo and not mentioned in the other thread:

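# SIMPLIFY = FALSE plus unlist() keeps this working when ranges within one chr differ in length
dt.compact[, unlist(mapply(seq, start, stop, SIMPLIFY = FALSE)), by = chr]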
arg0naut91

I don't know of a built-in solution, but here is a tidyverse method:

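library(dplyr)  # mutate(), select(), %>%
library(purrr)  # map2()
library(tidyr)  # unnest()

dt.compact %>%
  mutate(rng = map2(start, stop, ~ .x:.y)) %>%  # list-column holding one sequence per row
  select(-start, -stop) %>%
  unnest(cols = rng)                            # one output row per position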
Kent Johnson

Try this:

dt.compact[, .(chr, seq.int(start, stop)), by = 1:nrow(dt.compact)][, nrow := NULL][]

     chr V2
 1: chr1  1
 2: chr1  2
 3: chr1  3
 4: chr1  5
 5: chr1  6
 6: chr1  7
 7: chr2  2
 8: chr2  3
 9: chr2  7
10: chr2  8

OR:

dt.new = dt.compact[, .(chr, seq.int(start, stop)), by = 1:nrow(dt.compact)][, nrow := NULL]
dt.new
     chr V2
 1: chr1  1
 2: chr1  2
 3: chr1  3
 4: chr1  5
 5: chr1  6
 6: chr1  7
 7: chr2  2
 8: chr2  3
 9: chr2  7
10: chr2  8

The column names can be changed easily.
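For instance, a minimal sketch using data.table's setnames(), which renames by reference; the new name "pos" is only a placeholder chosen for illustration:

```r
library(data.table)

dt.compact <- data.table(chr = c("chr1", "chr1", "chr2", "chr2"),
                         start = c(1, 5, 2, 7), stop = c(3, 7, 3, 8))
dt.new <- dt.compact[, .(chr, seq.int(start, stop)), by = 1:nrow(dt.compact)][, nrow := NULL]

# Rename the auto-generated V2 column in place ("pos" is a hypothetical name)
setnames(dt.new, "V2", "pos")
names(dt.new)
```

Alternatively, the column can be named where it is created, e.g. .(chr, pos = seq.int(start, stop)), which avoids the rename step.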

Serhii