
I have a series of curves (Time, SkinTemp) stored in a data.frame, each identified by a unique id. For each id, I'd like to calculate the maximum difference between consecutive SkinTemp values using lapply (or aggregate).

So far I have the following, but it returns a single value, which is not right:

df<-data.frame(Time=seq(100),
               SkinTemp=rnorm(100,37,0.5),
               id=rep(1:10,each=10))



  maxDiff<-function(id,df) {
    a<-max(diff(df$SkinTemp))
    a
  }
maxA<-lapply(id,maxDiff,df) 

Any thoughts on why it doesn't retrieve a unique max value of SkinTemp for each id?

Edit:

Using aggregate there is no problem (aggregate ships with base R in the stats package, so no extra package is needed):

aggregate(data=df,SkinTemp~id,function(x)max(diff(x)))

So what am I doing wrong with lapply?

HCAI
  • The normal aggregate approach would be `aggregate(SkinTemp ~ id, df, max)` – talat Feb 07 '18 at 19:09
  • So your `lapply` shouldn't even work, because `id` is not defined except as a column in df. The *apply function you want is `tapply`, except it doesn't work on data.frames (and thus your current function doesn't work with it), so you need the wrapper function `by`, but that function is awkward to use for the most part. The simplest version using `tapply` is `tapply(df$SkinTemp, df$id, function(x) max(diff(x)))` – Vlo Feb 07 '18 at 19:10
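
To make the `tapply` versus `by` distinction in the comment above concrete, here is a minimal sketch on the question's toy data. The fixed seed and the names `via_tapply` and `via_by` are my own additions for illustration, not part of the original post.

```r
set.seed(1)  # assumption: fixed seed only so the example is reproducible
df <- data.frame(Time = seq(100),
                 SkinTemp = rnorm(100, 37, 0.5),
                 id = rep(1:10, each = 10))

# tapply takes a vector plus a grouping vector and applies the function per group
via_tapply <- tapply(df$SkinTemp, df$id, function(x) max(diff(x)))

# by() passes each id's sub-data.frame to the function instead of a bare vector
via_by <- by(df, df$id, function(d) max(diff(d$SkinTemp)))

# Both approaches should give one value per id, in the same order
all.equal(as.vector(via_tapply), as.vector(via_by))
```

With `by`, the function receives all columns for the group, which is handy if it also needs Time; for a single column, `tapply` is shorter.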

2 Answers


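The problem is in your maxDiff function. It is applied over the entire dataset because the id argument is never used inside the function body. To compute the value for a single id, you need to subset by that id inside the function before you call lapply. Note also that lapply has to iterate over unique(df$id), since id only exists as a column of df.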

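maxDiff<-function(id,df) {
 # keep only the SkinTemp values of this id, then take the largest consecutive difference
 a<-max(diff(df$SkinTemp[df$id==id]))
 a }

# id is a column of df, so iterate over its unique values
lapply(unique(df$id),maxDiff,df)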

Alternatively, without a separate maxDiff, you could pass the whole function inline to lapply/sapply, like so:

sapply(unique(df$id), function(x) max(diff(df$SkinTemp[df["id"]==x])))
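
A related idiom, sketched here on the same toy data: split the column by id first, then apply the function to each piece. The seed and the names `per_id` and `max_diff` are assumptions of mine for illustration. `vapply` behaves like `sapply` but checks that every result is a single number.

```r
set.seed(42)  # assumption: fixed seed only so the example is reproducible
df <- data.frame(Time = seq(100),
                 SkinTemp = rnorm(100, 37, 0.5),
                 id = rep(1:10, each = 10))

# split() returns a list with one SkinTemp vector per id
per_id <- split(df$SkinTemp, df$id)

# largest consecutive difference within each id, returned as a named numeric vector
max_diff <- vapply(per_id, function(x) max(diff(x)), numeric(1))
max_diff
```

The result is named by id, which makes it easier to match values back to groups than the unnamed output of the sapply call above.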
jasbner
  • Thank you for your answer, I'd like to accept this one as it's clearest for me to understand. Is there a way of catching or removing NA from either Time or SkinTemp within the function? – HCAI Feb 07 '18 at 20:00
  • You can use `is.na` to check for NA in Time or SkinTemp and remove those rows with something like `df[!is.na(df$Time),]` – jasbner Feb 07 '18 at 20:05
  • inside the function itself? I would prefer not to alter the initial data. – HCAI Feb 07 '18 at 20:08
  • Yes you can do this inside the function, check out https://stackoverflow.com/questions/4862178/remove-rows-with-nas-missing-values-in-data-frame – jasbner Feb 07 '18 at 20:20
  • Probably not the best example, but here is a shot: `sapply(unique(df$id), function(x) max(diff(as.numeric(df[complete.cases(df),]$SkinTemp[df[complete.cases(df),]["id"]==x]))))` – jasbner Feb 07 '18 at 20:24
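
For the NA question in the comments above, one possible way to drop missing values inside the function, leaving the original data.frame untouched, is sketched below. The function name `maxDiffNoNA` and the small example data are hypothetical. Note the assumption that it is acceptable to take the difference across a gap left by a removed NA row.

```r
maxDiffNoNA <- function(id, df) {
  # rows of this id where both Time and SkinTemp are present; df itself is not modified
  keep <- df$id == id & !is.na(df$Time) & !is.na(df$SkinTemp)
  x <- df$SkinTemp[keep]
  # diff() needs at least two values, so return NA for groups that are too short
  if (length(x) < 2) return(NA_real_)
  max(diff(x))
}

# small hand-made example with one missing SkinTemp in id 1
df <- data.frame(Time = 1:6,
                 SkinTemp = c(36.5, NA, 37.5, 37.0, 36.0, 38.0),
                 id = rep(1:2, each = 3))

res <- sapply(unique(df$id), maxDiffNoNA, df)
res
```

If differences across a gap should not count, the function would need to compare only rows that were adjacent in the original data, which this sketch does not attempt.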

If I read this correctly, you are trying to find the max of each id. For that you can use a for loop.

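get_Max_skin_temp_per_id<-function(){
df<-data.frame(Time=seq(100),
           SkinTemp=rnorm(100,37,0.5),
           id=rep(1:10,each=10))
maxSkin<-vector()
for (ids in unique(df$id)) {
  # max over all rows of this id (df$SkinTemp[ids] would pick a single element)
  a<-max(df$SkinTemp[df$id==ids])
  # repeat the max once per row of this id; assumes rows are ordered by id
  maxSkin<-c(maxSkin, rep(a, sum(df$id==ids)))
}
df$maxSkin<-maxSkin
return(df)
}
get_Max_skin_temp_per_id()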

This repeats each id's max in a new column after the id column. Hope it helps. To get just a vector of the per-id maxima, change it to:

get_Max_skin_temp_per_id<-function(){
df<-data.frame(Time=seq(100),
           SkinTemp=rnorm(100,37,0.5),
           id=rep(1:10,each=10))
maxSkin<-vector()
for (ids in unique(df$id)) {
  # max over all rows of this id
  a<-max(df$SkinTemp[df$id==ids])
  maxSkin<-c(maxSkin, a)
}

# return the whole vector of per-id maxima, not just the last value a
return(maxSkin)
}
get_Max_skin_temp_per_id()
Michael Vine
  • Thank you for your answer, I appreciate your time. I've decided to go with one further down as it is easier for me as a novice to read at first glance. – HCAI Feb 07 '18 at 20:01