Another basic question from an R newbie. I have a dataset, testMeanSD. Here is some relevant data, produced with dput() (my first time using it for output, so I hope I have done it correctly):
testMeanSD <- structure(list(RT = c(1245L, 1677L, 1730L, 1066L, 994L), mean = c(1143.77777777778,
1143.77777777778, 1143.77777777778, 1143.77777777778, 1143.77777777778
), sd = c(202.255299928596, 202.255299928596, 202.255299928596,
202.255299928596, 202.255299928596), RT2 = c(1245L, 1677L, 1730L,
1066L, 994L)), .Names = c("RT", "mean", "sd", "RT2"), row.names = c(NA,
5L), class = "data.frame")
RT2 is just a copy of RT for me to modify. For each row, I need to change RT2 if it meets one of the conditions below; otherwise RT2 keeps the value of RT (which is also its current value). The conditions:
Find all values in RT2 that exceed mean + 2.5 * sd and trim them to mean + 2.5 * sd:
if (RT2 > mean + (2.5 * sd)) RT2 = mean + 2.5 * sd
Find all values that are less than mean - 2.5 * sd and trim them to mean - 2.5 * sd:
else if (RT2 < mean - (2.5 * sd)) RT2 = mean - 2.5 * sd
Leave everything else as is:
else
RT2 = RT
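For reference, the rule above amounts to clamping each RT value into the interval [mean - 2.5 * sd, mean + 2.5 * sd]. A minimal sketch of one way to express that with base R's pmin() and pmax() follows. The helper names `lower` and `upper` are illustrative assumptions, not part of the original data, and the data frame is rebuilt here only so the snippet runs on its own:

```r
# Rebuild the sample data (same values as the dput() output, minus RT2).
testMeanSD <- data.frame(
  RT   = c(1245L, 1677L, 1730L, 1066L, 994L),
  mean = 1143.77777777778,   # recycled to all 5 rows
  sd   = 202.255299928596    # recycled to all 5 rows
)

# Hypothetical helper vectors: the per-row trimming bounds.
upper <- testMeanSD$mean + 2.5 * testMeanSD$sd
lower <- testMeanSD$mean - 2.5 * testMeanSD$sd

# pmax() raises anything below `lower` up to `lower`;
# pmin() then lowers anything above `upper` down to `upper`.
testMeanSD$RT2 <- pmin(pmax(testMeanSD$RT, lower), upper)

print(testMeanSD)
```

With this sample, only 1677 and 1730 lie above the upper bound (about 1649.416), so those two rows are trimmed and the other three keep their RT value.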
I thought this would be fairly basic in R, but I simply can't find a way to make it work. Here are some of my attempts (all failed):
First:
testMeanSD$RT2 = testMeanSD$RT
if (testMeanSD$RT2 > (testMeanSD$mean + (2.5 * testMeanSD$sd))) {
testMeanSD$RT2 = (testMeanSD$mean + (2.5 * testMeanSD$sd))
}
else if(testMeanSD$RT2 < (testMeanSD$mean - (2.5 * testMeanSD$sd))) {
testMeanSD$RT2 = (testMeanSD$mean - (2.5 * testMeanSD$sd))
}
else {
testMeanSD$RT2 = testMeanSD$RT
}
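A likely reason this first attempt fails: `if` expects a single TRUE or FALSE, but the comparison here produces one value per row. Recent R versions raise an error for a condition of length greater than one, and older versions warn and use only the first element. In addition, at top level an `else` that starts a new line after `}` is a syntax error. A sketch of the same logic using logical indexing instead is shown below; the names `upper`, `lower`, `too_high` and `too_low` are illustrative assumptions:

```r
# Rebuild the sample data so the snippet is self-contained.
testMeanSD <- data.frame(
  RT   = c(1245L, 1677L, 1730L, 1066L, 994L),
  mean = 1143.77777777778,
  sd   = 202.255299928596
)
testMeanSD$RT2 <- testMeanSD$RT

# Hypothetical helper vectors: per-row bounds.
upper <- testMeanSD$mean + 2.5 * testMeanSD$sd
lower <- testMeanSD$mean - 2.5 * testMeanSD$sd

# One logical value per row, marking which rows need trimming.
too_high <- testMeanSD$RT2 > upper
too_low  <- testMeanSD$RT2 < lower

# Overwrite only the flagged rows; all other rows keep their RT value.
testMeanSD$RT2[too_high] <- upper[too_high]
testMeanSD$RT2[too_low]  <- lower[too_low]

print(testMeanSD)
```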
Second:
ifelse(testMeanSD$RT2 > (testMeanSD$mean + (2.5 * testMeanSD$SD)), testMeanSD$RT2 <- (testMeanSD$mean + (2.5 * testMeanSD$sd)),
ifelse(testMeanSD$RT2 < (testMeanSD$Mean - (2.5 * testMeanSD$sd)), testMeanSD$RT2 <- (testMeanSD$mean - (2.5 * testMeanSD$sd)), testMeanSD$RT2 <- testMeanSD$RT)
Third:
testMeanSD$RT2 <- ifelse(testMeanSD$RT2 > (testMeanSD$mean + (2.5 * testMeanSD$sd)), testMeanSD$mean + (2.5 * testMeanSD$sd)),
ifelse(testMeanSD$RT2 < (testMeanSD$mean - (2.5 * testMeanSD$SD)), (testMeanSD$mean - (2.5 * testMeanSD$sd)), testMeanSD$RT2 <- testMeanSD$RT)
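Two things stand out in the second and third attempts. Column names are case-sensitive, so `testMeanSD$SD` and `testMeanSD$Mean` return NULL rather than the `sd` and `mean` columns. In the third attempt, the parentheses also appear to close the outer ifelse() before its final argument. A sketch of a nested ifelse() with consistent names and a single assignment on the outside follows; `upper` and `lower` are illustrative helper names, not part of the original data:

```r
# Rebuild the sample data so the snippet is self-contained.
testMeanSD <- data.frame(
  RT   = c(1245L, 1677L, 1730L, 1066L, 994L),
  mean = 1143.77777777778,
  sd   = 202.255299928596
)

# Hypothetical helper vectors: per-row bounds (note lowercase column names).
upper <- testMeanSD$mean + 2.5 * testMeanSD$sd
lower <- testMeanSD$mean - 2.5 * testMeanSD$sd

# ifelse() is vectorized: it returns one value per row, so the result
# is assigned once, rather than assigning inside the yes/no arguments.
testMeanSD$RT2 <- ifelse(testMeanSD$RT > upper, upper,
                  ifelse(testMeanSD$RT < lower, lower, testMeanSD$RT))

print(testMeanSD)
```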
I looked through some related posts, and this one seems closest: Loop over rows of dataframe applying function with if-statement
But it's not clear to me how to incorporate if/then logic into the approaches outlined there (other than as I have it above).
Any help would be greatly appreciated. Thanks!