I am creating a fictitious dataset that generates duration values (below) based on a known discrete distribution (basic MC sampling). Each duration is assigned to a sequential id number. A trivial example using rnorm() might look like the following:
library(data.table)
set.seed(135813) # whimsical seed
id_dt <- data.table(id = 1:6) # Six "id" numbers
duration_dt <- data.table(duration = abs(rnorm(6, mean = 20, sd = 10))) # Sample of six arbitrary positive values
id_durs <- id_dt[, .(id = id, duration = round(duration_dt$duration))] # combine the above DTs; round values to ints
For each duration value in the id_durs data table, I need to express the value as a sum of ones. That is, I want to add new rows, each carrying a count of one (mapped to the id and original duration), until the number of ones created equals the original value. In this example we would start with:
id duration
-- --------
1 7
2 34
3 33
4 2
5 40
6 27
And the desired result is:
id duration count
-- -------- -----
1 7 1
1 7 1
1 7 1
1 7 1
1 7 1
1 7 1
1 7 1 <== duration = 7, Rows = 7
2 34 1
2 34 1
2 34 1
2 34 1
2 34 1
2 34 1
2 34 1
2 34 1
... ... ... <== duration = 34, Rows = 34
3 33 1
... ... ... <== duration = 33, Rows = 33
4 2 1
4 2 1 <== duration = 2, Rows = 2
5 40 1
... ... ... <== duration = 40, Rows = 40
6 27 1
... ... ... <== duration = 27, Rows = 27
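To pin down what I mean by the desired result, here is a rough sketch of a check that any solution should pass. Note that check_expanded is a hypothetical helper made up for illustration (it is not part of data.table), the small expected table is typed out by hand, and the check assumes every duration is a positive whole number:

```r
library(data.table)

# Hypothetical helper (not part of data.table): returns TRUE when `expanded`
# has the target shape for `original`.
# Assumes every duration is a positive whole number.
check_expanded <- function(expanded, original) {
  # One summary row per id: number of rows and sum of the ones
  per_id <- expanded[, .(n_rows = .N, total = sum(count)), by = .(id, duration)]
  nrow(expanded) == sum(original$duration) &&   # total rows = sum of durations
    nrow(per_id) == nrow(original) &&           # one group per original id
    setequal(per_id$id, original$id) &&         # same ids as the input
    all(per_id$n_rows == per_id$duration) &&    # rows per id = its duration
    all(per_id$total == per_id$duration)        # ones per id sum to duration
}

# Tiny hand-written example: durations 3 and 2 should give 5 rows in total
small    <- data.table(id = 1:2, duration = c(3, 2))
expected <- data.table(id = c(1L, 1L, 1L, 2L, 2L),
                       duration = c(3, 3, 3, 2, 2),
                       count = 1)

check_expanded(expected, small)
```

The same check should apply to the six-row example, where the result would need 7 + 34 + 33 + 2 + 40 + 27 = 143 rows.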
One way I know to decompose a single value (verbose) is:
stuff = 50.4
decomp <- lapply(1:round(stuff), function(i) i <- 1)
result <- data.table(count = unlist(decomp))
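For what it's worth, the same single-value decomposition can probably be written more compactly with base R's rep(). A minimal sketch, reusing the stuff value from above (the names n and ones are just for illustration):

```r
stuff <- 50.4

n    <- round(stuff)  # number of ones to create
ones <- rep(1, n)     # numeric vector holding n ones

length(ones)          # how many ones were created
sum(ones)             # the ones add back up to the rounded value
```

Wrapping ones in data.table(count = ones) should give the same one-column table as the lapply() version.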
But when trying to map this to id and original value, I'm hitting walls. I broke down and tried a for loop as a crutch. Applied to the above:
for (i in 1:length(id_durs))
{
  id_dur_val <- data.table(id = id_durs$id,
                           duration = id_durs$duration,
                           count = rep(1, each = id_durs$duration[i]))
}
But this just gives me a repetition equal to the number of elements in the original data. I also tried using expand.grid(), but only the first element (as expected) was used as the iterator, so all row counts were the same for each value of duration.
This feels like such a trivial problem, so I know I'm overlooking something.
Thank you for any advice.