I have a data set that I am trying to use to generate a different data set in R. The dataset has many columns; but the three relevant columns for generating the new data set are "Reach", "Results", and "DV". Reach and results are numeric. DV is binary with 0s and 1s. In the original dataset, all rows have DV = 0.
For each row of the original data set, I am attempting to take one variable "Reach" and replicate that row "reach" number of times. Then for this new set of rows, I want to change DV from 0 to 1 for "results" number (from the original row) of the new rows.
For example, in row 33 of the original data set: Reach = 1004, Results = 45, DV = 0. The new data set should have row 33 replicated 1004 times, for 45 of those new rows DV should be changed from 0 to 1.
The code I wrote for the task works... but it is taking 10+ hours to run because the file is so large. Any ideas for how to simplify this code so it can process more quickly
empty_new.video <- new.video[FALSE,]
for(i in 1:nrow(new.video)){
n.times <- new.video[i,'Reach'] #determine number of times to repeat rows
if (n.times > 0){
for (j in 1:n.times){
empty_new.video[nrow(empty_new.video) + 1 , ] <- new.video[i,]
}
}
dv.times <- new.video[i,'Results'] #creating dependent variable
if (dv.times>0){
for (k in 1:dv.times){
empty_new.video[nrow(empty_new.video) - n.times + k,'DV'] <- 1
}
}
}