I was already thinking about using the apply family and vectorization, but the calculations in my loop depend on random events from the previous iteration, so the apply family cannot speed up my code. I profiled the loop with profvis, and since I fill a data table with 25 columns in every iteration, this writing takes a large share of the run time.

So my question is a more general one: what is the fastest way to fill a data table row by row? I pre-define the data table and its dimensions beforehand. As I do not know the exact number of rows in advance (it depends on a random stochastic process), I create a data table with the maximum possible number of observations as rows and delete the unused rows afterwards, outside the loop.

Here is a short example to show how I generally write into my data table inside the loop. This example could easily be vectorized, but I did not want to post my huge chunk of code, as the point of interest is just the writing into the data table.

require(data.table)

myDT <- data.table(matrix(NA, nrow = 1000, ncol = 20))
setnames(myDT, paste0("Col", 1:20))

for (x in 1:800){

  # Normally, all my calculations that depend on the events of the previous iteration happen here.
  # To avoid posting a huge chunk of code, I simply fill every column with x to show how I write into the data table.

  myDT$Col1[x] <- x
  myDT$Col2[x] <- x
  myDT$Col3[x] <- x
  myDT$Col4[x] <- x
  myDT$Col5[x] <- x
  myDT$Col6[x] <- x
  myDT$Col7[x] <- x
  myDT$Col8[x] <- x
  myDT$Col9[x] <- x
  myDT$Col10[x] <- x
  myDT$Col11[x] <- x
  myDT$Col12[x] <- x
  myDT$Col13[x] <- x
  myDT$Col14[x] <- x
  myDT$Col15[x] <- x
  myDT$Col16[x] <- x
  myDT$Col17[x] <- x
  myDT$Col18[x] <- x
  myDT$Col19[x] <- x
  myDT$Col20[x] <- x
}

myDT <- na.omit(myDT)
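
For comparison, here is a minimal sketch of an alternative I am considering, assuming the same pre-allocation strategy: data.table::set() assigns by reference, so it avoids the copies that repeated $<- assignments can trigger, and it can write all columns of a row in one call. The object name myDT2 is only for illustration.

library(data.table)

# Pre-allocate with integer NAs so the column types match the values written later
myDT2 <- data.table(matrix(NA_integer_, nrow = 1000, ncol = 20))
setnames(myDT2, paste0("Col", 1:20))

for (x in 1:800) {
  # set() writes by reference: row index i, column positions j, one list element per column
  set(myDT2, i = x, j = 1:20, value = as.list(rep(x, 20)))
}

# Drop the unused pre-allocated rows, as before
myDT2 <- na.omit(myDT2)

I have not benchmarked this against my real code yet, so I would be interested to know whether set(), := inside the loop, or a completely different filling strategy is the recommended approach.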
