I have a function that calculates the incremental AUC (area under the curve) from values measured at fixed time points (0, 15, 30, ..., 120 min). I want to apply this function to every column of my data frame in one batch and, ideally, create a new list that contains only the `auc` value from each run, but I am having trouble coding this. I was thinking `lapply` might work, but I wonder if there are better suggestions, since I will be writing similar functions and running them in batch in the future. Thank you so much. Below is the function; the data frame is `df1` and `CAT.12` is one of its columns. `x` is the time and `y` is the variable (column).

i_auc.fn <- function(x,y) {
  auc <- ifelse(y[2] > y[1], (y[2]-y[1])*(x[2]-x[1])/2, 0)
  seg.type <- 0
  for (i in 3:length(x)) {
    if (y[i] >= y[1] & y[i-1] >= y[1]) {
      # Both points at or above baseline: trapezoid = mean height * width
      auc[i-1] <- (((y[i]-y[1])/2) + (y[i-1]-y[1])/2) * (x[i]-x[i-1])
      seg.type[i-1] <- 1
    } else if (y[i] >= y[1] & y[i-1] < y[1]) {
      auc[i-1] <- ((y[i]-y[1])^2/(y[i]-y[i-1])) * (x[i]-x[i-1])/2
      seg.type[i-1] <- 2
    } else if (y[i] < y[1] & y[i-1] >= y[1]) {
      auc[i-1] <- ((y[i-1]-y[1])^2/(y[i-1]-y[i])) * (x[i]-x[i-1])/2
      seg.type[i-1] <- 3
    } else if (y[i] < y[1] & y[i-1] < y[1]) {
      auc[i-1] <- 0
      seg.type[i-1] <- 4
    } else {
      # The above cases are exhaustive, so this should never happen
      return(cat("i:", i, "Error: No condition met\n"))
    }
  }
  return(list(auc=sum(auc), segments=auc, seg.type=seg.type))
}
iAUC <- i_auc.fn(df1$time, df1$CAT.12)

My `df1` looks like this (the data was posted as a picture, which is not reproduced here).

Chris Ahn
  • [See here](https://stackoverflow.com/q/5963269/5325862) on making a reproducible example that is easier for folks to help with. That includes a sample of data we can work with, not a picture of it. It's also unclear what you're trying to do here with this block of dense, unexplained code; try to keep in mind the *minimal* part of the [mcve] guidance. Is this entire function necessary to answer the question about using `lapply`? Paring it down to what's essential is the first step you need to do to debug, and makes it easier for others to follow – camille Jan 04 '22 at 01:47
  • Yes, my bad not being considerate on that. Thanks for letting me know. – Chris Ahn Jan 04 '22 at 03:48
  • Please clarify your specific problem or provide additional details to highlight exactly what you need. As it's currently written, it's hard to tell exactly what you're asking. – Community Jan 11 '22 at 10:51

1 Answer

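lapply() takes a list (or vector) and a function, applies the function to each element, and returns a list of the results. A data frame is itself a list of columns, so lapply() will accept it, but it would also process the time column, and each column is passed as the first argument of the function (x here) unless you say otherwise. Your data therefore needs a little preparation before lapply() gives the result you want. Here are two options I can offer: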

Option 1: This is not the most elegant solution, admittedly, but it gets you the list output you want. Simply loop through the columns of your data frame, except for the time column, and save each result as a new element of a list. Here is a reproducible example of that approach:

set.seed(450)

time<-seq(0,120,15)
CAT.01<-rnorm(9, 5, 2)
CAT.02<-rnorm(9, 5, 0.4)
CAT.03<-rnorm(9, 5, 0.22)
CAT.04<-rnorm(9, 5, 1.52)
CAT.05<-rnorm(9, 5, 1.5)
CAT.06<-rnorm(9, 5, 2.1)
CAT.07<-rnorm(9, 5, 3)

DF<-data.frame(time, CAT.01, CAT.02, CAT.03, CAT.04, CAT.05, CAT.06, CAT.07)

i_auc.fn <- function(x,y) {
  auc <- ifelse(y[2] > y[1], (y[2]-y[1])*(x[2]-x[1])/2, 0)
  seg.type <- 0
  for (i in 3:length(x)) {
    if (y[i] >= y[1] & y[i-1] >= y[1]) {
      # Both points at or above baseline: trapezoid = mean height * width
      auc[i-1] <- (((y[i]-y[1])/2) + (y[i-1]-y[1])/2) * (x[i]-x[i-1])
      seg.type[i-1] <- 1
    } else if (y[i] >= y[1] & y[i-1] < y[1]) {
      auc[i-1] <- ((y[i]-y[1])^2/(y[i]-y[i-1])) * (x[i]-x[i-1])/2
      seg.type[i-1] <- 2
    } else if (y[i] < y[1] & y[i-1] >= y[1]) {
      auc[i-1] <- ((y[i-1]-y[1])^2/(y[i-1]-y[i])) * (x[i]-x[i-1])/2
      seg.type[i-1] <- 3
    } else if (y[i] < y[1] & y[i-1] < y[1]) {
      auc[i-1] <- 0
      seg.type[i-1] <- 4
    } else {
      # The above cases are exhaustive, so this should never happen
      return(cat("i:", i, "Error: No condition met\n"))
    }
  }
  return(list(auc=sum(auc), segments=auc, seg.type=seg.type))
}

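OUT.LIST<-list()
for(i in 2:ncol(DF)){
  # Store each result under its column name; indexing by i would leave OUT.LIST[[1]] empty
  OUT.LIST[[names(DF)[i]]]<-i_auc.fn(DF$time, DF[,i])
}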

Option 2: Make your input a list first, and then use lapply(). Here is a reproducible example of that approach:

set.seed(450)

time<-seq(0,120,15)
CAT.01<-rnorm(9, 5, 2)
CAT.02<-rnorm(9, 5, 0.4)
CAT.03<-rnorm(9, 5, 0.22)
CAT.04<-rnorm(9, 5, 1.52)
CAT.05<-rnorm(9, 5, 1.5)
CAT.06<-rnorm(9, 5, 2.1)
CAT.07<-rnorm(9, 5, 3)

LST<-list(CAT.01, CAT.02, CAT.03, CAT.04, CAT.05, CAT.06, CAT.07)

i_auc.fn <- function(x,y) {
  auc <- ifelse(y[2] > y[1], (y[2]-y[1])*(x[2]-x[1])/2, 0)
  seg.type <- 0
  for (i in 3:length(x)) {
    if (y[i] >= y[1] & y[i-1] >= y[1]) {
      # Both points at or above baseline: trapezoid = mean height * width
      auc[i-1] <- (((y[i]-y[1])/2) + (y[i-1]-y[1])/2) * (x[i]-x[i-1])
      seg.type[i-1] <- 1
    } else if (y[i] >= y[1] & y[i-1] < y[1]) {
      auc[i-1] <- ((y[i]-y[1])^2/(y[i]-y[i-1])) * (x[i]-x[i-1])/2
      seg.type[i-1] <- 2
    } else if (y[i] < y[1] & y[i-1] >= y[1]) {
      auc[i-1] <- ((y[i-1]-y[1])^2/(y[i-1]-y[i])) * (x[i]-x[i-1])/2
      seg.type[i-1] <- 3
    } else if (y[i] < y[1] & y[i-1] < y[1]) {
      auc[i-1] <- 0
      seg.type[i-1] <- 4
    } else {
      # The above cases are exhaustive, so this should never happen
      return(cat("i:", i, "Error: No condition met\n"))
    }
  }
  return(list(auc=sum(auc), segments=auc, seg.type=seg.type))
}

# Name the time argument so that each list element is passed as y, not x
OUT.LIST<-lapply(LST, i_auc.fn, x=time)
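
One detail worth checking in a call like this is how lapply() matches arguments: each list element is handed to the function as its first argument that has not already been supplied by name, and anything written after the function is forwarded unchanged. The sketch below uses a made-up two-argument function, `describe_args` (hypothetical, not part of the original code), and tiny made-up inputs to show the difference between naming the extra argument and passing it by position.

```r
# Hypothetical helper: reports the sum of whatever arrives as x and as y.
describe_args <- function(x, y) paste0("x=", sum(x), ", y=", sum(y))

# Made-up inputs: two short "columns" and a two-point time vector.
cols <- list(a = c(1, 2), b = c(10, 20))
time <- c(0, 15)

# x is supplied by name, so each element of `cols` is matched to y.
by_name <- lapply(cols, describe_args, x = time)

# Nothing is named, so each element of `cols` becomes x and `time` becomes y.
by_position <- lapply(cols, describe_args, time)

print(by_name$a)      # "x=15, y=3"
print(by_position$a)  # "x=3, y=15"
```

For a function with the signature `function(x, y)` where x is time, the named form is the one that sends each column to y.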

There may be an approach using the dlply() and colwise() functions in the plyr package, but because you aren't splitting your data along the time series, the result is simply a single list element. Someone else may be able to find a way to make that work.
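
To end up with only the `auc` values, as the question asks, one option is to pull that element out of each result after the batch run. The sketch below is self-contained, so it uses a small made-up data frame (`toy_df`) and a short stand-in function (`toy_auc`, a plain trapezoidal area that returns a list with an `auc` element). Both names are hypothetical, and `i_auc.fn` could be substituted for `toy_auc`. It relies on the fact that a data frame is stored as a list of columns, so its columns can be handed to lapply() once the time column is set aside.

```r
# Stand-in for i_auc.fn (hypothetical): plain trapezoidal area, returned in a
# list with an `auc` element so the extraction step looks the same.
toy_auc <- function(x, y) {
  segments <- diff(x) * (head(y, -1) + tail(y, -1)) / 2
  list(auc = sum(segments), segments = segments)
}

# Made-up data: one time column plus two measurement columns.
toy_df <- data.frame(
  time = c(0, 15, 30),
  A    = c(0, 4, 0),
  B    = c(1, 1, 1)
)

# Set the time column aside and run the function over the remaining columns.
# Naming x means each column is matched to y.
value_cols <- setdiff(names(toy_df), "time")
results <- lapply(toy_df[value_cols], toy_auc, x = toy_df$time)

# Keep only the `auc` element of each result, as a named numeric vector.
auc_only <- vapply(results, function(res) res$auc, numeric(1))
print(auc_only)
```

`auc_only` is a named numeric vector, with one entry per column; `as.list(auc_only)` turns it into a list if later code expects one. Because the function is just an argument here, the same two lines should work for other functions that take `x` and `y` and return a list with an `auc` element.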

Sean McKenzie