I have two dataframes in R, one of which contains model outputs and the other contains model thresholds. That is, the outputs dataframe (call it df1) looks something like this:
model1 model2 model3
0.086 0.2645728 0.0001668753
0.024 0.2109496 0.0001905100
0.052 0.2484194 0.0038053175
0.274 0.3650003 0.0002842775
0.260 0.4055953 0.0280523161
And the threshold dataframe (call it df2) looks something like:
model threshold
model1 0.5520000
model2 0.7924895
model3 0.7537394
I want to apply the >= operation to each entry in df1, comparing it against the threshold in df2 whose model name equals the entry's column name, and store the resulting logical values in a new dataframe (call it df3), which would have the same dimensions as df1. That is, df3 holds the predicted label for each entry in df1, given the corresponding model-specific threshold in df2. It's clear that I could do this in a brute-force for-loop fashion like:
df3 <- df1
for (mdl in df2$model) {
  # compare each model's column against that model's threshold
  df3[, mdl] <- df1[, mdl] >= df2$threshold[df2$model == mdl]
}
I don't like this solution, and I'm hoping there is a more idiomatic, vectorized R way to perform this operation.
Reproducible Sample Data
df1 <- read.table(header = TRUE, text = "
model1 model2 model3
0.086 0.2645728 0.0001668753
0.024 0.2109496 0.0001905100
0.052 0.2484194 0.0038053175
0.274 0.3650003 0.0002842775
0.260 0.4055953 0.0280523161")
df2 <- read.table(header = TRUE, text = "
model threshold
model1 0.5520000
model2 0.7924895
model3 0.7537394")
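To make the target concrete, here is a rough sketch of the kind of loop-free result I have in mind, using base R's sweep(). It assumes every column of df1 has exactly one matching row in df2; the name thr is just something I made up for illustration, and the sample data is repeated so the snippet runs on its own. I'm not sure this is the best or most idiomatic approach.

```r
df1 <- read.table(header = TRUE, text = "
model1 model2 model3
0.086 0.2645728 0.0001668753
0.024 0.2109496 0.0001905100
0.052 0.2484194 0.0038053175
0.274 0.3650003 0.0002842775
0.260 0.4055953 0.0280523161")
df2 <- read.table(header = TRUE, text = "
model threshold
model1 0.5520000
model2 0.7924895
model3 0.7537394")

# Reorder the thresholds to follow df1's column order
# (assumption: each column name appears exactly once in df2$model).
thr <- df2$threshold[match(names(df1), df2$model)]

# sweep() applies `>=` column-wise (MARGIN = 2), pairing column j with thr[j].
df3 <- as.data.frame(sweep(as.matrix(df1), 2, thr, FUN = ">="))
```

With the sample data above, every output is below its model's threshold, so all entries of df3 come out FALSE.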