I want to run t-tests on a dataset that has two factors, A and B. I have this working, but without removing the outliers.
My idea is to remove values using the 1.5*IQR criterion, but I could not figure out how to do this the dplyr way.
Here is what I have:
wallSize %>%
  select(Time, A, B) %>%
  # long format: one row per (A, B, variable, value)
  gather(key = variable, value = value, -A, -B) %>%
  group_by(A, B, variable) %>%
  # collapse each group's values into a single list element
  summarise(value = list(value)) %>%
  # one list column per level of A (True / False)
  spread(A, value) %>%
  group_by(variable) %>%
  mutate(p_value = t.test(unlist(True), unlist(False), paired = TRUE)$p.value,
         t_value = t.test(unlist(True), unlist(False), paired = TRUE)$statistic)
I think I should remove the outliers after the spread, for each of the 6 lists individually, but I can't figure out how. Any suggestions from the R masters?
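To make the criterion concrete, here is a minimal sketch of what I mean by the 1.5*IQR rule, applied to a single numeric vector. The helper name `remove_outliers`, its `k` argument, and the example values are hypothetical and only for illustration; it assumes the usual fences (Q1 - 1.5*IQR and Q3 + 1.5*IQR) computed with the default `quantile()` method. What I am missing is how to apply something like this to each list in the pipeline above.

```r
# Hypothetical helper (not part of dplyr): keep only the values that fall
# inside the fences [Q1 - k*IQR, Q3 + k*IQR].
remove_outliers <- function(x, k = 1.5) {
  q <- quantile(x, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
  iqr <- q[2] - q[1]
  lower <- q[1] - k * iqr
  upper <- q[2] + k * iqr
  # drop NAs and anything outside the fences
  x[!is.na(x) & x >= lower & x <= upper]
}

# Made-up example values, not taken from wallSize
x <- c(5.2, 6.1, 5.8, 6.4, 5.9, 62.2)
cleaned <- remove_outliers(x)
print(cleaned)
```

One thing I am unsure about: if values are dropped from each list independently, the True and False lists can end up with different lengths, and `t.test(..., paired = TRUE)` needs both samples to have the same length.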
Cheers
EDIT: Sample data
head() of the frame before grouping:
  Display Change Comp  TargetType TotalTime SelectionTime Score
  <chr>   <chr>  <chr>      <int>     <dbl>         <dbl> <int>
1 Wall    Shape  False          1     62.2          53.7      4
2 Wall    Shape  False          2     14.1          12.6      5
3 Wall    Shape  True           0     26.3          23.0      5
4 Wall    Shape  True           0     20.3          14.7      5
5 Wall    Shape  True           1     23.3          21.6      5
6 Wall    Shape  False          2      6.55          5.17     5
after grouping:
  TargetType variable      False      True
       <int> <chr>         <list>     <list>
1          0 SelectionTime <dbl [28]> <dbl [28]>
2          1 SelectionTime <dbl [28]> <dbl [28]>
3          2 SelectionTime <dbl [28]> <dbl [28]>