I'm trying to reproduce this equation in R to do Kernel K-Means clustering:
But the loop I wrote takes too long to finish, and I don't know how to improve it. Here is the part of the code that is causing the problem:
c = 3
third = numeric(c)  # one result per cluster
for (g in 1:c) {
  ans = 0
  for (k in 1:nrow(iris)) {
    for (l in 1:nrow(iris)) {
      ans = ans + (iris[k,'cluster']==g) * (iris[l,'cluster']==g) * kernelmatrix[k,l]
    }
  }
  third[g] = ans
}
This is only part of the full function, so treat it as pseudocode. The expression (iris[l,'cluster']==g) checks whether the element iris[l,'cluster'] belongs to cluster g, and kernelmatrix[k,l] is an element of the n x n matrix of kernel evaluations.
I know that R isn't very good with loops, but I don't know how to improve them.
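As a rough sketch of what the loop computes: the two indicator factors only keep the rows and columns that belong to cluster g, so the double sum should equal the sum of the corresponding submatrix of kernelmatrix. The toy matrix, the random labels, and the names cluster, third_loop and third_vec below are assumptions made for illustration, not part of my real function.

```r
# Toy stand-ins (assumptions): a random symmetric matrix and random cluster labels.
set.seed(1)
n <- 150
n_clusters <- 3
kernelmatrix <- crossprod(matrix(rnorm(n * n), n, n))  # symmetric n x n matrix
cluster <- sample(n_clusters, n, replace = TRUE)

# The original triple loop, kept for comparison.
third_loop <- numeric(n_clusters)
for (g in seq_len(n_clusters)) {
  ans <- 0
  for (k in seq_len(n)) {
    for (l in seq_len(n)) {
      ans <- ans + (cluster[k] == g) * (cluster[l] == g) * kernelmatrix[k, l]
    }
  }
  third_loop[g] <- ans
}

# Vectorized version: the indicators select the rows and columns of cluster g,
# so the double sum is the sum of that submatrix.
third_vec <- vapply(seq_len(n_clusters), function(g) {
  idx <- cluster == g
  sum(kernelmatrix[idx, idx])
}, numeric(1))

# Compare the two results up to floating-point tolerance.
all.equal(third_loop, third_vec)
```

With the real data, cluster would be iris[, 'cluster'] and kernelmatrix the matrix built in the full function.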
EDIT: Here's the code that builds kernelmatrix, although I don't think it matters for the question (wherever you see data, you can think of any dataset, iris for example):
## Euclidean distance
# Remember:
# 1. || a || = sqrt(aDOTa)
# 2. d(x,y) = || x - y || = sqrt((x-y)DOT(x-y))
# 3. aDOTb = sum(a*b)
d <- function(x, y) {
  aux = x - y
  dis = sqrt(sum(aux * aux))
  return(dis)
}
## Radial Basis Function kernel
# Remember:
# 1. K(x,x') = exp(-q||x-x'||^2), where ||x-x'|| can be defined as the
#    Euclidean distance and 'q' is the gamma parameter
rbf <- function(x, y, q = 0.2) {
  aux <- d(x, y)
  rbfd <- exp(-q * (aux)^2)
  return(rbfd)
}
#
# Calculating the kernel matrix
q = 0.2  # gamma parameter (assumed here to be the same as rbf's default)
kernelmatrix = matrix(0, nrow(data), nrow(data))
for (i in 1:nrow(data)) {
  for (j in 1:nrow(data)) {
    kernelmatrix[i,j] = rbf(data[i,1:(ncol(data)-1)], data[j,1:(ncol(data)-1)], q)
  }
}
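The kernel matrix itself can probably be built without the double loop as well. Here is a minimal sketch, assuming data is iris (label in the last column) and q is 0.2, that uses base R's dist() for the pairwise Euclidean distances. The names X and kernelmatrix_vec are mine, introduced only for this example.

```r
# Assumption: `data` is iris, whose last column is the class label.
data <- iris
q <- 0.2  # gamma parameter, same value as the default of rbf()

# Keep only the numeric feature columns.
X <- as.matrix(data[, -ncol(data)])

# dist() gives all pairwise Euclidean distances; squaring them and applying
# exp(-q * .) yields the RBF kernel for every pair of rows at once.
kernelmatrix_vec <- exp(-q * as.matrix(dist(X))^2)

# Spot check one entry against the direct formula exp(-q * ||x - y||^2).
i <- 1
j <- 51
direct <- exp(-q * sum((X[i, ] - X[j, ])^2))
all.equal(unname(kernelmatrix_vec[i, j]), direct)
```

This trades the n^2 calls to rbf() for a single call to dist(), at the cost of holding the full n x n distance matrix in memory.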