I'm trying to write a custom function that grades an answer given the format, number set, and problem of the question. To do that, I have to read those three variables from three different columns and then check the answer in the corresponding row.
Consider the following data frame as an example of the one I'm using.
df = data.frame(ID = 1:10,
                Formato_SCREE1 = c("FN", "FN", "FN", "PR", "PR", "FN", "PR", "PR", "FN", "PR"),
                Problema_SCREE1 = c("UNIDAD", "DECIMA", "UNIDAD", "UNIDAD", "DECIMA", "DECIMA", "UNIDAD", "DECIMA", "DECIMA", "UNIDAD"),
                Set_SCREE1 = c("SET4", "SET1", "SET4", "SET3", "SET3", "SET4", "SET3", "SET2", "SET1", "SET2"),
                Resp_SCREE1 = c(0.7777778, 0.5000000, 0.7777778, 0.7142857, 2.5000000, 0.7777778, 0.7142857, 0.2857143, 0.3333333, 110.1111111))
   ID Formato_SCREE1 Problema_SCREE1 Set_SCREE1 Resp_SCREE1
1   1             FN          UNIDAD       SET4   0.7777778
2   2             FN          DECIMA       SET1   0.5000000
3   3             FN          UNIDAD       SET4   0.7777778
4   4             PR          UNIDAD       SET3   0.7142857
5   5             PR          DECIMA       SET3   2.5000000
6   6             FN          DECIMA       SET4   0.7777778
7   7             PR          UNIDAD       SET3   0.7142857
8   8             PR          DECIMA       SET2   0.2857143
9   9             FN          DECIMA       SET1   0.3333333
10 10             PR          UNIDAD       SET2 110.1111111
My first attempt at grading the answers was the following.
library(dplyr)

# Keep only SET1 rows, then subtract the response from the correct answer
# that matches each row's format/problem combination.
temp1 = df %>%
  filter(Set_SCREE1 == "SET1") %>%
  mutate(Error_SCREE1 = ifelse(Formato_SCREE1 == "PR" & Problema_SCREE1 == "DECIMA", Correct_answer_PR_DECIMA_SET1 - Resp_SCREE1,
                        ifelse(Formato_SCREE1 == "PR" & Problema_SCREE1 == "UNIDAD", Correct_answer_PR_UNIDAD_SET1 - Resp_SCREE1,
                        ifelse(Formato_SCREE1 == "FN" & Problema_SCREE1 == "DECIMA", Correct_answer_FN_DECIMA_SET1 - Resp_SCREE1,
                        ifelse(Formato_SCREE1 == "FN" & Problema_SCREE1 == "UNIDAD", Correct_answer_FN_UNIDAD_SET1 - Resp_SCREE1, 0))))) %>%
  select(ID, Error_SCREE1)
And the correct answers:
Correct_answer_PR_DECIMA_SET1 = 1
Correct_answer_PR_UNIDAD_SET1 = 2
Correct_answer_FN_DECIMA_SET1 = 3
Correct_answer_FN_UNIDAD_SET1 = 4
This works fine, but I have to repeat it four times (once for each value of Set_SCREE1), and then do all of that again for ...SCREE2, ...SCREE3, and ...SCREE4. That results in 16 chunks of code, which is too many lines and difficult to read.
Then I tried to write a function that could do the same in fewer lines, and ended up with this:
error_calculator = function(data, set, formato, problema, correcta) {
  temp = data %>%
    filter(Set_SCREE1 == set) %>%
    mutate(Error_SCREE1 = ifelse(formato == "PR" & problema == "DECIMA", correcta - Resp_SCREE1,
                          ifelse(formato == "PR" & problema == "UNIDAD", correcta - Resp_SCREE1,
                          ifelse(formato == "FN" & problema == "DECIMA", correcta - Resp_SCREE1,
                          ifelse(formato == "FN" & problema == "UNIDAD", correcta - Resp_SCREE1, 0))))) %>%
    select(ID, Error_SCREE1)
  return(temp)
}
temp1 = error_calculator(df, "SET1", "Formato_SCREE1", "Problema_SCREE1", "Correct_answer_PR_DECIMA_SET1")
The problem with this function is that it only works up to the filter() line; after that, none of the conditions in the nested ifelse() calls is met, so the Error_SCREE1 column is filled with 0. My first thought was that you can't pass column names to a function the way I'm doing it. But according to this you actually can. So I can't really tell why the column names are not being recognized.
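A minimal sketch that may help isolate the issue, assuming the column name reaches the function as a quoted string (as in the call above). The `toy` data frame and the `is_pr` column are made up for illustration. The idea is that inside mutate() a string argument such as `formato` stays a plain string, so `formato == "PR"` compares the text "Formato_SCREE1" with "PR" rather than the column's values. dplyr's `.data[[ ]]` pronoun is one way to look a column up by a name stored in a string.

```r
library(dplyr)

# Toy data using the same column names as the example (values are made up).
toy = data.frame(ID = 1:3,
                 Formato_SCREE1 = c("PR", "FN", "PR"),
                 Problema_SCREE1 = c("DECIMA", "DECIMA", "UNIDAD"),
                 Resp_SCREE1 = c(0.5, 2.5, 1.0))

# The column name is held in a string, as in the error_calculator() call.
formato = "Formato_SCREE1"

# Here `formato` is just the string "Formato_SCREE1", so the comparison
# is "Formato_SCREE1" == "PR": a single FALSE recycled to every row.
literal = toy %>% mutate(is_pr = formato == "PR")

# .data[[formato]] fetches the column whose name is stored in `formato`,
# so the comparison is done row by row on the actual values.
by_column = toy %>% mutate(is_pr = .data[[formato]] == "PR")

print(literal$is_pr)
print(by_column$is_pr)
```

The same reasoning would apply to `problema`. Note also that `correcta` is passed as the string "Correct_answer_PR_DECIMA_SET1" rather than its numeric value, so the subtraction would presumably need the number itself once the conditions start matching.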