
I have the code below, which contains two nested loops. It reads monthly streamflow data and builds a multi-replicate version of it. The loops are very slow. Is there an alternative way to make this faster?

library(xlsx)
library(data.table)

a <- read.xlsx("streamflow.xlsx", sheetName = "Sheet1", header = TRUE)
b <- matrix(nrow = 129792, ncol = 17)
b <- data.frame(b)
i <- 0

for (j in -11:1236) {
  for (k in 1:104) {
    i <- i + 1
    j <- j + 12
    j[j > 1248] <- j - 1248
    b[i, ] <- a[j, ]
  }
}

Thanks

Heerbod
  • 1
    I can only see 2 loops. And what is `data.table` doing ? – SymbolixAU Aug 24 '17 at 00:55
  • 1
    Can you `dput(head(b))` so we can see the data and what's going on in the loop? There's probably a way to vectorize it – csgroen Aug 24 '17 at 00:55
  • 2
    Also, `dput(head(a))`. Help us help you by giving a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example#5963610) – csgroen Aug 24 '17 at 00:57
  • 1
    @Wen: won't actually help. – Ben Bolker Aug 24 '17 at 01:14
  • Thanks @csgroen. a holds a monthly streamflow time series, with one row per month (Aug 1913, Sep 1913, ...) and one column per station (St1, St2, ...), e.g. 1000 and 2000 for Aug 1913 and 4000 and 5000 for Sep 1913. b is an empty matrix that will be filled with the multi-replicate streamflow data. I converted b to a data frame because I got an error in the loop with the matrix format. – Heerbod Aug 24 '17 at 02:50

1 Answer


I believe this is a faithful translation of your double for-loop into vectorized code: it builds the full vector of row indices first and then subsets a once, which should be dramatically faster than assigning rows one at a time. Also, there is no need to declare b as a matrix and convert it to a data.frame; the values can be taken directly from a.

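# Outer-loop start values and inner-loop counter from the original code
j_iter <- -11:1236
k_iter <- 1:104

# Offsets added inside the inner loop (12, 24, ..., 1248), repeated for every start value
k <- seq(12, length(k_iter) * 12, 12)
k <- rep(k, times=length(j_iter))

# Each start value repeated once per inner iteration, then shifted by its offset
j <- rep(j_iter, each=length(k_iter))
j <- j + k
# Wrap indices that run past the last row (1248) back to the start
j[j > 1248] <- j[j > 1248] - 1248

# Subset all rows in a single step
b <- a[j,]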
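
If you want to convince yourself that the two versions agree, here is a minimal sketch on a scaled-down problem. The sizes (24 rows, a step of 12, 2 inner iterations) and the toy data frame with columns St1 and St2 are assumptions made up for illustration, not your real data. It records the row indices the double loop would visit and compares them with the vectorized index; `mod_idx` is an alternative way to write the wrap-around with `%%`, which I am adding as a variation rather than something from your code.

```r
# Scaled-down sizes (assumptions): 24 rows instead of 1248, 2 inner steps instead of 104
n    <- 24L
step <- 12L
n_k  <- n %/% step

# Row indices visited by the original double loop
loop_idx <- integer(n * n_k)
i <- 0L
for (j in (1L - step):(n - step)) {
  for (k in seq_len(n_k)) {
    i <- i + 1L
    j <- j + step
    if (j > n) j <- j - n   # same wrap-around as j[j > 1248] <- j - 1248
    loop_idx[i] <- j
  }
}

# Vectorized index, built the same way as above
j_iter  <- (1L - step):(n - step)
k_off   <- rep(seq(step, n_k * step, by = step), times = length(j_iter))
vec_idx <- rep(j_iter, each = n_k) + k_off
vec_idx[vec_idx > n] <- vec_idx[vec_idx > n] - n

# Alternative wrap-around using modular arithmetic (1-based indices)
mod_idx <- (rep(j_iter, each = n_k) + k_off - 1L) %% n + 1L

# Toy stand-in for the streamflow table (hypothetical values)
a_toy <- data.frame(St1 = seq_len(n), St2 = seq_len(n) * 10)
b_toy <- a_toy[vec_idx, ]
rownames(b_toy) <- NULL   # repeated rows otherwise get made-unique row names

stopifnot(all(loop_idx == vec_idx), all(mod_idx == vec_idx))
```

If the `stopifnot()` call passes silently, both index vectors match the loop on the toy sizes. The same check can be run with the real sizes (1248, 12, 104) before trusting the result on the full data.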
dvantwisk