Basically a followup on this question.
I'm still trying to get to grips with vectorisation in R while speeding up a coworker's code. I've read The R Inferno and "Speed up the loop operation in R".
My aim is to speed up the following code. The complete dataset contains ~1,000 columns by 10,000 to 1,000,000 rows:
df3 <- structure(c("X", "X", "X", "X", "O", "O", "O", "O", "O", "O",
"O", "O", "O", "O", "O", "O"), .Dim = c(2L, 8L), .Dimnames = list(
c("1", "2"), c("pig_id", "code", "DSFASD32", "SDFSD56",
"SDFASD12", "SDFSD56342", "SDFASD12231", "SDFASD45442"
)))
score_1 <- structure(c(0, 0, 0, 0, 0, 0), .Dim = 2:3)
for (i in 1:nrow(df3)) {
  a <- matrix(table(df3[i, 3:ncol(df3)]))
  if (nrow(a) == 1) {
    score_1[i, 1] <- 0 # count number of X (error), N (not compared) and O (ok)
    score_1[i, 2] <- a[1, 1]
  }
  if (nrow(a) == 2) {
    score_1[i, 1] <- a[1, 1]
    score_1[i, 2] <- a[2, 1]
  }
  if (nrow(a) == 3) {
    score_1[i, 1] <- a[1, 1]
    score_1[i, 2] <- a[2, 1]
    score_1[i, 3] <- a[3, 1]
  }
}
colnames(score_1) <- c("N", "O", "X")
I have been trying myself but can't figure it out yet. Here is what I've tried. It shows the same output as the code above, but I'm not sure whether it actually does the same thing. I'm missing that bit of insight into R and my data set.
I can't seem to get my code to give the same output as the for loop.
Edit: In response to Heroka's response I've updated my reproducible example:
Output of the for loop:
     [,1] [,2] [,3]
[1,]    0    6    0
[2,]    0    6    0
Output of the apply function:
     1 2
[1,] 6 6
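For comparison, here is a minimal sketch of one possible vectorised way to get the same three-column layout. It assumes the only codes in columns 3 onward are "N", "O" and "X", and that NA cells should be ignored. The names `calls` and `score_vec` are made up for illustration. Unlike the loop, which fills the columns by position in the `table()` output, this counts each code by name, so a row without any "N" still puts its "O" count in the "O" column.

```r
# Sample data equivalent to the df3 in the example (2 rows x 8 columns)
df3 <- matrix(c(rep("X", 4), rep("O", 12)), nrow = 2,
              dimnames = list(c("1", "2"),
                              c("pig_id", "code", "DSFASD32", "SDFSD56",
                                "SDFASD12", "SDFSD56342", "SDFASD12231",
                                "SDFASD45442")))

# Keep only the columns that hold the codes (hypothetical helper name)
calls <- df3[, 3:ncol(df3), drop = FALSE]

# Compare the whole matrix against each code at once and sum per row.
# Assumption: "N", "O" and "X" are the only codes of interest.
score_vec <- cbind(N = rowSums(calls == "N", na.rm = TRUE),
                   O = rowSums(calls == "O", na.rm = TRUE),
                   X = rowSums(calls == "X", na.rm = TRUE))

print(score_vec)
```

On the two sample rows this gives 0, 6, 0 for N, O, X, matching the for loop output above. Each comparison builds a logical matrix the size of `calls`, so on the full dataset it may be worth processing the rows in chunks if memory is tight.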