Optimizing/alternatives to for-loops in R

Question

I'm fairly new to R and come from a C++ background, so I tend to reach for for-loops, but these seem to be very slow in R. Here is a particular example:

# add the counter columns to the household data
dat1 <- cbind(dat1, data.frame(tot.hh = 0, below.18 = 0, above.18 = 0, below.65 = 0, above.65 = 0))
for(i in seq_len(nrow(dat1))){
  # individuals belonging to household i (matched on hb030 and hb020)
  tmp <- subset(dat2, dat2$hb030 == dat1[i,]$hb030 & dat2$hb020 == dat1[i,]$hb020)
  dat1[i,]$tot.hh <- nrow(tmp)
  # loop over rows: length(tmp) would give the number of columns of a data frame
  for(j in seq_len(nrow(tmp))){
    tmp.age <- 2006 - tmp[j,]$rb080
    ifelse(tmp.age<18, dat1[i,]$below.18 <- dat1[i,]$below.18+1, dat1[i,]$above.18 <- dat1[i,]$above.18+1)
    ifelse(tmp.age<65, dat1[i,]$below.65 <- dat1[i,]$below.65+1, dat1[i,]$above.65 <- dat1[i,]$above.65+1)
  }
}

The idea is that there is one data set of households (dat1) and one of personal data on the individuals in each household (dat2), and I'm trying to add information to each household, such as how many members it has and how many of them fall into each age group. My code works but takes forever (more than an hour for what is a fairly trivial computation). There are also some obvious inefficiencies, like the repeated subsetting, but I haven't found a better way of doing this so far. Is there a vectorized approach to this kind of problem?

You could take a look at `apply` family of base R functions, or the functions in the `purrr` package which is part of the `tidyverse`. — Samuel, Oct 12 '17 at 21:30
Could you include a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) in your question? — Samuel, Oct 12 '17 at 21:41
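
In the absence of a reproducible example, here is a minimal sketch of one possible vectorized approach in base R. The column names (`hb020`, `hb030`, `rb080`) and the reference year 2006 come from the question; the toy values, the helper names `age`, `flags`, `counts` and `res`, and the assumption that `dat1` does not yet contain the count columns are invented for illustration. The idea is to compute every age once, count per household with `aggregate`, and attach the counts with `merge` instead of subsetting `dat2` for each household.

```r
# Toy data: column names follow the question, values are made up.
dat1 <- data.frame(hb020 = c("AT", "AT", "BE", "BE"),
                   hb030 = c(1, 2, 1, 2))
dat2 <- data.frame(hb020 = c("AT", "AT", "AT", "BE", "BE"),
                   hb030 = c(1, 1, 2, 1, 1),
                   rb080 = c(1990, 1950, 1930, 2000, 1975))  # year of birth

# Ages of all individuals at once (vectorized, no loop).
age <- 2006 - dat2$rb080

# One indicator column per count; summing indicators yields counts.
flags <- data.frame(tot.hh   = 1,
                    below.18 = as.integer(age < 18),
                    above.18 = as.integer(age >= 18),
                    below.65 = as.integer(age < 65),
                    above.65 = as.integer(age >= 65))

# Sum the indicators within each household (keyed by hb020 and hb030).
counts <- aggregate(flags, by = dat2[c("hb020", "hb030")], FUN = sum)

# Attach the counts to the household table; all.x keeps households
# that have no individuals in dat2.
res <- merge(dat1, counts, by = c("hb020", "hb030"), all.x = TRUE)

# Households without any individuals come back as NA; set them to 0.
count.cols <- names(flags)
res[count.cols][is.na(res[count.cols])] <- 0

print(res)
```

Two caveats with this sketch: `merge` sorts the result by the key columns by default, so the row order may differ from `dat1`, and a missing `rb080` makes the age-group counts for that household `NA` rather than skipping the person. Packages such as `dplyr` or `data.table` offer equivalent grouped summaries if base R is still too slow on the real data.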
