I have thousands of txt files (1.txt, 2.txt, 3.txt, ...) to be used as input (predictions), and another file called "labels". I need to run a few commands on each input file to create its respective output (an AUC value). I am following the suggestion in a previous post (Looping through all files in directory in R, applying multiple commands).
But I am having trouble writing the function to be used inside this loop.
My original code (for one predictions file):
library(ROCR)
labels <- read.table(file="/data/labels/labels", header=F, sep="\t")
predictions <- read.table(file="/data/input/3.txt", header=F)
pred <- prediction(predictions, labels)
perf <- performance(pred,"tpr","fpr")
auc <- attr(performance(pred ,"auc"), "y.values")
auc
write.table(auc, "/data/out/AUC3.txt",sep="\t")
My code so far (not working):
library(ROCR)
labels <- read.table(file="/data/labels/labels", header=F, sep="\t")
files <- list.files(path="/data/input/", pattern="*.txt", full.names=TRUE, recursive=FALSE)
auc <- function(r) {
  pred <- prediction(files, labels)
  perf <- performance(pred,"tpr","fpr")
  auc <- attr(performance(pred ,"auc"), "y.values")
}
lapply(files, function(x) {
  t <- read.table(x, header=F) # load file
  out <- auc(t)
  write.table(out, "/data/out/", sep="\t")
})
Error message:
Error in prediction(files, labels) :
Number of predictions in each run must be equal to the number of labels for each run.
Calls: lapply -> FUN -> auc -> prediction
Execution halted
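For reference, the error appears to come from `prediction(files, labels)` inside `auc()`: the function ignores its argument `r` and passes the vector of file paths instead of the data read from one file, so the number of "predictions" does not match the number of labels. In addition, `write.table()` is given a directory rather than a file name. Below is a minimal sketch of one way the loop could be restructured, not a definitive fix. The helper names `auc_for_file` and `run_all` are hypothetical, and the sketch assumes each file holds one prediction per row in its first column, in the same order as the labels.

```r
library(ROCR)

# Hypothetical helper: AUC for one predictions file against a shared label vector.
# Assumption: predictions are in the first column, one per row, aligned with labels.
auc_for_file <- function(pred_file, labels) {
  scores <- read.table(pred_file, header = FALSE)[[1]]
  pred <- prediction(scores, labels)
  # performance() returns an S4 object; the AUC is the first element of y.values
  performance(pred, "auc")@y.values[[1]]
}

# Hypothetical driver: process every .txt file in in_dir and
# write one output file per input, named AUC<input file name>, into out_dir.
run_all <- function(in_dir, labels_file, out_dir) {
  labels <- read.table(labels_file, header = FALSE, sep = "\t")[[1]]
  files <- list.files(in_dir, pattern = "\\.txt$", full.names = TRUE)
  for (f in files) {
    auc <- auc_for_file(f, labels)
    out_file <- file.path(out_dir, paste0("AUC", basename(f)))
    write.table(auc, out_file, sep = "\t")
  }
  invisible(files)
}

# Example call with the paths from the question:
# run_all("/data/input", "/data/labels/labels", "/data/out")
```

Note that the pattern is written as the regular expression `"\\.txt$"`, since `list.files()` expects a regex rather than a shell glob such as `*.txt`.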