Here is a gist of what I want to do:

I've got 2 data frames:
x (id is unique)

id          timestamp
282462839   2012-12-05 10:55:00
282462992   2012-12-05 12:08:00
282462740   2012-12-05 12:13:00
282462999   2012-12-05 12:48:00

y (id is not unique)

id          value1    value2
282462839   300       100
282462839   300       200
282462839   400       300
282462999   500       400
282462999   300       150

I also have a function myfunc(id, pvalue) that computes something and returns one of the value2 values, depending on pvalue and the other value1 values (the logic is more complicated than just pvalue == value1).

I want to create a third column in x that contains the corresponding result of myfunc(id, pvalue), where pvalue is a constant integer (say 20).

So, in essence, I want to do this:

x$t20 <- myfunc(x$id,20)

I tried using lapply and sapply this way:

x$t20 <- sapply(as.vector(x$id),myfunc,pvalue=20)

I tried using lapply, and also omitting the as.vector, but I kept getting this error:

Error in .pointsToMatrix(p2) : Wrong length for a vector, should be 2

It works when I pass mean instead of myfunc, in which case it just replicates $id in $t20.

How do I do this?

EDIT 1: Here's a skeleton of myfunc:

myfunc <- function(xid,pvalue) {
  result <- subset(y,id==xid)
  retVal <- -1
  if(nrow(result) < 12){
    return(NaN)
  }
  for(i in (1:nrow(result))){
    #code to process result
  }
  return(retVal)
}
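
For reference, here is a minimal sketch of how a function that expects a single id can be applied to every row of x. It uses the toy data from above; `myfunc_sketch`, its `min_rows` argument (standing in for the hard-coded 12 so the toy data passes the check), and the "closest value1" rule are assumptions made for illustration, not the real logic of myfunc.

```r
# Toy data copied from the question.
x <- data.frame(id = c(282462839, 282462992, 282462740, 282462999))
y <- data.frame(
  id     = c(282462839, 282462839, 282462839, 282462999, 282462999),
  value1 = c(300, 300, 400, 500, 300),
  value2 = c(100, 200, 300, 400, 150)
)

# Hypothetical stand-in for myfunc: it expects ONE id per call.
myfunc_sketch <- function(xid, pvalue, min_rows = 2) {
  result <- y[y$id == xid, , drop = FALSE]
  if (nrow(result) < min_rows) {
    return(NaN)
  }
  # Placeholder rule (an assumption): value2 of the row whose
  # value1 is closest to pvalue.
  result$value2[which.min(abs(result$value1 - pvalue))]
}

# vapply calls the function once per id and checks that each
# call returns exactly one numeric value.
x$t20 <- vapply(x$id, myfunc_sketch, numeric(1), pvalue = 20)
print(x)
```

Ids with too few rows in y (here the two ids that do not appear in y at all) end up as NaN, mirroring the early return in the skeleton.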
orderof1
  • If your function `myfunc` is vectorized `x$t20 <- myfunc(x$id, 20)` should return what you want. – DrDom Jun 05 '13 at 05:58
  • If I do that, it gives me an error: `Error in .pointsToMatrix(p2) : Wrong length for a vector, should be 2`. In addition: `Warning message: In id == xid : longer object length is not a multiple of shorter object length`. My actual definition is `myfunc(xid, pvalue)`, and there is a line where I do `result <- subset(y, id == xid)`. I tried changing it to `xid[1]`, but that still gave the "vector length should be 2" error. – orderof1 Jun 05 '13 at 06:00
  • It seems that the problem is in the function. Please edit your post and add the code of `myfunc`. – DrDom Jun 05 '13 at 06:45
  • Please give your complete function and make your code reproducible to enable testing. The `for` loop in your function makes me suspicious. – Roland Jun 05 '13 at 07:40
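
The warning quoted in the comments above can be reproduced in isolation. The sketch below uses only the id columns from the question (the names `ids_x` and `ids_y` are introduced here for illustration) and shows what happens when the whole `x$id` vector reaches `id == xid`: R compares element by element and recycles the shorter vector, which is probably not what the function intends.

```r
ids_y <- c(282462839, 282462839, 282462839, 282462999, 282462999)  # y$id
ids_x <- c(282462839, 282462992, 282462740, 282462999)             # x$id

# Capture the warning text instead of letting it print.
warn_msg <- tryCatch(ids_y == ids_x, warning = function(w) conditionMessage(w))
print(warn_msg)

# The comparison still returns a value: ids_x is recycled to length 5,
# so position 5 of ids_y is compared against position 1 of ids_x.
recycled <- suppressWarnings(ids_y == ids_x)
print(recycled)

# A membership test asks a different question: is each y id present
# anywhere in x?
member <- ids_y %in% ids_x
print(member)
```

Passing one id per call, for example through `sapply` or `vapply`, avoids the recycling altogether.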

1 Answer

It is very difficult to help without the full code, but here are some tips. First, obtain a logical vector of the ids that should be processed, then use a vectorized ifelse statement.

tmp <- table(y$id) >= 12
y$t20 <- ifelse(tmp[as.character(y$id)], your_new_func(), NaN)
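
To make the two lines above concrete, here is a runnable sketch on the toy data from the question. The threshold `min_rows = 3` (instead of 12) and the expression `value2 / value1` are placeholders chosen only so the example produces a mix of results; the real computation would go where `your_new_func()` appears.

```r
y <- data.frame(
  id     = c(282462839, 282462839, 282462839, 282462999, 282462999),
  value1 = c(300, 300, 400, 500, 300),
  value2 = c(100, 200, 300, 400, 150)
)

min_rows <- 3  # hypothetical threshold; the question uses 12

# Number of rows per id, looked up by name for every row of y.
counts <- table(y$id)
keep   <- as.vector(counts[as.character(y$id)]) >= min_rows

# ifelse works on whole vectors: rows of ids with too few
# observations get NaN, the rest get the placeholder computation.
y$t20 <- ifelse(keep, y$value2 / y$value1, NaN)
print(y)
```

Both branches of `ifelse` are evaluated for the full vector, so the computation has to be safe to run on every row, including the ones that end up as NaN.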
DrDom
  • Oh! Since R gave no detailed explanation of where the error occurred, I was assuming it had something to do with either passing the argument or the subset part! I think I found out where the error occurred! It didn't strike me that it could be in the internals of the function! Thank you so much! – orderof1 Jun 05 '13 at 07:56
  • The error is probably caused by the code you hid, because your original function works for me and returns only a warning message. In R it is better to avoid `for` loops and use vectorized functions. – DrDom Jun 05 '13 at 08:05
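
On finding where an error comes from: the text before the colon in an R error message is the call that raised it, and `traceback()` run right after the failure prints the chain of calls that led there. The sketch below shows the same idea programmatically with `tryCatch`. Both functions are hypothetical and only mimic the shape of the problem; `inner_helper` stands in for whatever the hidden code calls.

```r
# Hypothetical helper that rejects input of the wrong length.
inner_helper <- function(p) {
  if (length(p) != 2) {
    stop("Wrong length for a vector, should be 2")
  }
  p
}

# Hypothetical wrapper playing the role of myfunc.
outer_sketch <- function(xid, pvalue) {
  inner_helper(xid)
}

# Catch the condition object rather than letting the error abort.
err <- tryCatch(outer_sketch(282462839, 20), error = function(e) e)

print(conditionMessage(err))        # the message text
print(deparse(conditionCall(err)))  # the call that raised the error
```

Here the recorded call points at the helper rather than at `outer_sketch`, `sapply`, or `subset`, which is the kind of hint that helps narrow down a failure inside a function body.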