I have a dataframe of water quality measurements that needs some QA/QC: each value has to be checked against limit-of-detection and limit-of-quantification thresholds. These thresholds vary by analyte (Ammonia, NOx, Orthophosphate, Urea, Total Nitrogen, Total Phosphorus), and more flagging criteria may be introduced in the future. For reporting purposes, the flags are simply an additional column that contains the relevant flag for a given sample (row), like "F1", "F2", etc. Here is how I currently build this column:
library(dplyr)  # %>% and mutate()
library(purrr)  # map_chr()

# Each *_flaging function takes a single concentration and returns one flag.
# "x" means no flag applies.

# Ammonia: F1 at or below 0.01, F2 above 0.01 and up to 0.02
ammonia_flaging <- function(value){
  if (value <= 0.01){
    "F1"
  } else if (value > 0.01 && value <= 0.02){
    "F2"
  } else {
    "x"
  }
}
# map_chr() returns a character vector, so flags is a plain column, not a list column
nh3_flags <- subset(data, Analyte.Name == "Ammonia", select = c(Sample.ID, avgConc)) %>%
  mutate(flags = map_chr(avgConc, ammonia_flaging))

# NOx: F1 at or below 0.04, F2 above 0.04 and up to 0.2
nitrate_flaging <- function(value){
  if (value <= 0.04){
    "F1"
  } else if (value > 0.04 && value <= 0.2){
    "F2"
  } else {
    "x"
  }
}
nox_flags <- subset(data, Analyte.Name == "NOx", select = c(Sample.ID, avgConc)) %>%
  mutate(flags = map_chr(avgConc, nitrate_flaging))

# Orthophosphate: F1 at or below 1, F2 above 1 and up to 3
ortho_flaging <- function(value){
  if (value <= 1){
    "F1"
  } else if (value > 1 && value <= 3){
    "F2"
  } else {
    "x"
  }
}
ortho_flags <- subset(data, Analyte.Name == "Orthophosphate", select = c(Sample.ID, avgConc)) %>%
  mutate(flags = map_chr(avgConc, ortho_flaging))

# TP: F1 at or below 1, F2 above 1 and up to 5
tp_flaging <- function(value){
  if (value <= 1){
    "F1"
  } else if (value > 1 && value <= 5){
    "F2"
  } else {
    "x"
  }
}
tp_flags <- subset(data, Analyte.Name == "TP", select = c(Sample.ID, avgConc)) %>%
  mutate(flags = map_chr(avgConc, tp_flaging))

# TN: F3 for negative values
tn_flaging <- function(value){
  if (value < 0){
    "F3"
  } else {
    "x"
  }
}
tn_flags <- subset(data, Analyte.Name == "TN", select = c(Sample.ID, avgConc)) %>%
  mutate(flags = map_chr(avgConc, tn_flaging))

# Urea: F3 for negative values
urea_flaging <- function(value){
  if (value < 0){
    "F3"
  } else {
    "x"
  }
}
urea_flags <- subset(data, Analyte.Name == "Urea", select = c(Sample.ID, avgConc)) %>%
  mutate(flags = map_chr(avgConc, urea_flaging))

# Stack the per-analyte results into one table
all_flags <- rbind(ortho_flags, nox_flags, nh3_flags, tp_flags, tn_flags, urea_flags)
I then use merge to match these flags with their Sample ID in the original dataframe. I feel like there is a more elegant way to do this in far fewer lines of code, but this is the only approach I could come up with that stays extensible: a new flagging criterion can easily be introduced into a given analyte's function with another else if branch. I would appreciate any ideas.
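For concreteness, the merge step mentioned above looks roughly like the sketch below. The toy `data` and `all_flags` tables hold made-up values for illustration only, and joining on both `Sample.ID` and `avgConc` is an assumption (it avoids cross-matching when one Sample.ID appears under several analytes, provided the concentrations differ).

```r
# Toy stand-in for the real measurements (hypothetical values)
data <- data.frame(
  Sample.ID    = c("S1", "S2", "S3"),
  Analyte.Name = c("Ammonia", "Ammonia", "NOx"),
  avgConc      = c(0.005, 0.015, 0.5),
  stringsAsFactors = FALSE
)

# Toy stand-in for the stacked per-analyte flag table
all_flags <- data.frame(
  Sample.ID = c("S1", "S2", "S3"),
  avgConc   = c(0.005, 0.015, 0.5),
  flags     = c("F1", "F2", "x"),
  stringsAsFactors = FALSE
)

# Left join: keep every row of data and attach its flag where one exists
flagged <- merge(data, all_flags, by = c("Sample.ID", "avgConc"), all.x = TRUE)
print(flagged)
```

With `all.x = TRUE`, any row of `data` without a matching entry in `all_flags` is kept and gets `NA` in the `flags` column.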