Vectorizing this function in R

Question

I have the following function:

kde.cv = function(X,s)    {
  l = length(X)

  log.fhat.vector = c()
  for (i in 1:l) {
    current.log.fhat = log ( kde(X[i],X[-i],s) )
    log.fhat.vector[i] = current.log.fhat
  }

  CV.score = sum(log.fhat.vector)

  return(CV.score)
}

I'd like to vectorize this without using any for loops or apply statements, but I can't see how to do so. Any help would be appreciated. Thanks.

EDIT: In response to the requests for clarification, here are more details on the function inputs and on the user-defined function called inside kde.cv.

X is a dataset in the form of a numeric vector; the one I used as input has length 7: c(-1.1653, -0.7538, -1.3218, -2.3394, -1.9766, -1.8718, -1.5041). s is a scalar, set to 0.2 here. kde is a user-defined function that I wrote. Here is the implementation:

kde = function(x,X,s){
  l = length(x)   
  b = matrix(X,l,length(X),byrow = TRUE)
  c = x - b 
  phi.matrix = dnorm(c,0,s)
  d = rowMeans(phi.matrix)

  return(d)
}

In this function, X is the same vector of data points used in kde.cv, and s is the same scalar value of 0.2. x is a vector of evaluation points for the function; I used seq(-2.5, -0.5, by = 0.1).

Please provide a more complete, reproducible example. Include values for X and s so that we can use them to test your function. Documentation on what you are trying to achieve would also help the community. – Juan Zamora, May 15 '18 at 02:38
@Juan Zamora I have edited my original post to give more details. In terms of what I'm doing with this code: kde is a function to fit a nonparametric density via a normal kernel on a dataset given by X using a vector of evaluation points x and bandwidth s. kde.cv is a function to conduct cross validation in order to choose an optimal bandwidth s. — Jaime Melara Sosa, May 20 '18 at 07:35
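
Written out, the two functions described above compute the following. This is a sketch read off the code in the question; the symbols n, m, \hat f and \phi_s are notation introduced here, with \phi_s the normal density with mean 0 and standard deviation s:

```latex
% kde(x, X, s): kernel density estimate at x from a sample X_1, ..., X_m
\hat f(x) = \frac{1}{m} \sum_{j=1}^{m} \phi_s(x - X_j)

% kde.cv(X, s): leave-one-out log-likelihood, with n = length(X)
\mathrm{CV}(s) = \sum_{i=1}^{n} \log\!\left( \frac{1}{n-1} \sum_{j \ne i} \phi_s(X_i - X_j) \right)
```

With likelihood cross-validation of this kind, the bandwidth is usually chosen to maximise the score over a set of candidate values of s.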

score 0 · Answer 1 · answered May 15 '18 at 02:53

Here is an option using sapply

kde.cv = function(X,s) 
    sum(sapply(1:length(X), function(i) log(kde(X[i], X[-i], s))))

FKneip

The OP requests "I'd like to vectorize this without using any for loops or apply statements" – SymbolixAU May 15 '18 at 09:24
@SymbolixAU Without more information (in particular about what `kde` does) it's difficult to optimise the code. My answer should be faster than the original code attempt where `log.fhat.vector` is dynamically expanded in each step; the other answer also uses an apply method (`mapply` through `Vectorize`). – FKneip May 16 '18 at 00:16

C.C. · Answer 2 · 2018-05-21T23:30:59.633

Please provide a more complete example, including the kde() function. Is that a user-defined function?

As an alternative to sapply, you can try Vectorize(). There are some examples on Stack Overflow:

Vectorize() vs apply()

Here is an example

f1 <- function(x,y) return(x+y) 
f2 <- Vectorize(f1) 

f1(1:3, 2:4) 
[1] 3 5 7
f2(1:3, 2:4) 
[1] 3 5 7

and the second example

f1 <- function(x) 
{
 new.vector<-c()  
 for (i in 1:length(x)) 
 {
  new.vector[i]<-sum(x[i] + x[-i])
 }
 return(sum(new.vector))
}

f2<-function(x)
{
 f3<-function(y, i)
 {
  u<-sum(y[i]+y[-i])
  return(u)
 }
 f3.v<-Vectorize(function(i) f3(y = x, i=i))
 new.value<-f3.v(1:length(x))
 return(sum(new.value))
}

f1(1:3) 
[1] 24

f2(1:3) 
[1] 24

Note: Vectorize is a wrapper for mapply
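
To see what that wrapper means in practice, here is a minimal sketch (the names calls, f and fv are made up for illustration) that counts how often the underlying function is invoked. A natively vectorized call runs the function once on the whole vectors, whereas the Vectorize() version runs it once per element, so it behaves like a loop rather than a vector/matrix operation:

```r
# Counter incremented on every invocation of f (illustrative only)
calls <- 0
f <- function(x, y) {
  calls <<- calls + 1
  x + y
}

# `+` is already vectorized, so one call handles all three elements
direct <- f(1:3, 2:4)
calls.direct <- calls

# Vectorize() hands the work to mapply, which calls f element by element
calls <- 0
fv <- Vectorize(f)
wrapped <- fv(1:3, 2:4)
calls.wrapped <- calls

print(c(direct = calls.direct, wrapped = calls.wrapped))
```

Both calls return the same values; only the number of invocations of f differs.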

EDIT 1

Based on your response, I edited your kde.cv function.

kde.cv = function(X,s)    {
 l = length(X)

 log.fhat.vector = c()
 for (i in 1:l) {
  current.log.fhat = log ( kde(X[i],X[-i],s) )
  log.fhat.vector[i] = current.log.fhat
 }

 CV.score = sum(log.fhat.vector)

 return(CV.score)
}

kde = function(x,X,s){
 l = length(x)   
 b = matrix(X,l,length(X),byrow = TRUE)
 c = x - b 
 phi.matrix = dnorm(c,0,s)
 d = rowMeans(phi.matrix)

 return(d)
}


##### Vectorize kde.cv ######

kde.cv.v = function(X,s)   
{
 # Vectorize() wraps mapply, so kde is still called once per index i
 kde.v<-Vectorize(function(i) kde(X[i], X[-i], s))

 CV.score <- sum(log(kde.v(1:length(X))))

 return(CV.score)
}

X<-c(-1.1653, -0.7538, -1.3218, -2.3394, -1.9766, -1.8718, -1.5041)
s<-0.2
x<-seq(-2.5, -0.5, by = 0.1)

kde.cv(X, s)
[1] -10.18278

kde.cv.v(X, s)
[1] -10.18278

EDIT 2

I think the following functions may match your requirement. Since the lowercase x is not used in your kde.cv, I edited both functions so that they take only X and s:

kde.cv.2 <- function(X,s)    
{
 log.fhat.vector<-log(kde.2(X, s))
 CV.score = sum(log.fhat.vector)
 return(CV.score)
}

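kde.2<-function(X, s)
{
 l <- length(X)  
 # each row of b is a copy of X, so c[i, j] = X[i] - X[j]
 b <- matrix(rep(X, l), l, l, byrow = TRUE)
 c <- X - b
 # drop the self-comparisons so that row i only uses the other points
 diag(c) <- NA
 phi.matrix <- dnorm(c, 0, s)
 d <- rowMeans(phi.matrix, na.rm = TRUE)
 return(d)
}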

X<-c(-1.1653, -0.7538, -1.3218, -2.3394, -1.9766, -1.8718, -1.5041)
s<-0.2 

kde.cv(X,s)
[1] -10.18278

kde.cv.2(X, s)
[1] -10.18278
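
For comparison, the same leave-one-out score can also be written with outer() rather than an explicit matrix() call. This is only a sketch, assuming kde is the normal-kernel estimator shown in the question; kde.cv.ref and kde.cv.outer are hypothetical names, and kde.cv.ref restates the original computation with sapply purely to check the result:

```r
# Reference version: kde as in the question, sapply as in the first answer
kde <- function(x, X, s) {
  b <- matrix(X, length(x), length(X), byrow = TRUE)
  rowMeans(dnorm(x - b, 0, s))
}
kde.cv.ref <- function(X, s) {
  sum(sapply(seq_along(X), function(i) log(kde(X[i], X[-i], s))))
}

# Matrix-only version (hypothetical name): no loop, no apply, no Vectorize
kde.cv.outer <- function(X, s) {
  n <- length(X)
  K <- dnorm(outer(X, X, "-"), 0, s)  # K[i, j] = kernel at X[i] - X[j]
  diag(K) <- 0                         # exclude each point's own kernel
  sum(log(rowSums(K) / (n - 1)))       # mean over the other n - 1 points
}

X <- c(-1.1653, -0.7538, -1.3218, -2.3394, -1.9766, -1.8718, -1.5041)
s <- 0.2
all.equal(kde.cv.ref(X, s), kde.cv.outer(X, s))
```

Like kde.2, this builds an n-by-n matrix, so memory use grows quadratically with length(X); for large datasets the trade-off against a loop may be worth measuring.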

Yes, kde() is a user-defined function I wrote. I have edited my original post to give more details. – Jaime Melara Sosa, May 20 '18 at 07:31
@JaimeMelaraSosa So, in your `kde.cv` function, the first argument of `kde` you input is `X[i]`, which is a single value, while the first argument of `kde` should be a vector. — C.C., May 20 '18 at 17:58
@JaimeMelaraSosa I assume `kde.cv` may match your requirement. — C.C., May 20 '18 at 18:11
Thanks for the response. I realize X[i] is a single value, but isn't a single value simply a vector of length 1? I don't think this matters much, since the function gives me the correct answer anyway. As for your rewritten kde.cv function using Vectorize(), I'm not sure this is what I am looking for. When I say "vectorize" the function, I mean writing it using only vector/matrix operations, without loops or apply statements. I'm not too familiar with Vectorize(), but isn't it similar to apply() in that it is itself written using loops? – Jaime Melara Sosa, May 20 '18 at 21:19
@JaimeMelaraSosa It just looks a bit strange to me. BTW, since the lowercase `x` is not used, the `kde` function could take only two arguments. – C.C., May 21 '18 at 23:30
