
I tried to run t-tests in R on the following data frame.

df <- structure(list(freq = c(9, 11, 14, 12, 10, 9, 16, 10, 11, 15, 
13, 12, 12, 13, 13, 9, 16, 14, 12, 15, 16, 10, 11, 13, 14, 14, 
14, 16, 8, 10, 14, 14, 11, 11, 11, 11, 13, 7, 12, 13, 14, 11, 
11, 13, 10, 14, 10, 10, 12, 8, 9, 12, 14, 11, 12, 12, 14, 14, 
14, 15, 12, 13, 14, 8, 9, 11, 10, 14, 12, 12, 9, 10, 8, 14, 11, 
14, 9, 13, 13, 13, 10, 9, 13, 10, 13, 10, 13, 12, 11, 12, 10, 
12, 8, 11, 12, 15, 12, 12, 11, 13, 12, 10, 13, 9, 11, 9, 11, 
8, 12, 12, 12, 10, 11, 12, 9, 13, 14, 11, 11, 14, 13, 12, 14, 
15, 12, 12, 12, 14), class = structure(c(3L, 3L, 2L, 2L, 2L, 
2L, 2L, 3L, 2L, 3L, 4L, 4L, 4L, 4L, 3L, 2L, 3L, 2L, 1L, 4L, 1L, 
4L, 1L, 4L, 2L, 2L, 3L, 3L, 2L, 4L, 1L, 4L, 4L, 4L, 3L, 3L, 3L, 
2L, 1L, 4L, 3L, 3L, 1L, 4L, 1L, 2L, 2L, 3L, 3L, 4L, 2L, 2L, 3L, 
3L, 2L, 2L, 2L, 1L, 1L, 2L, 1L, 1L, 4L, 1L, 1L, 1L, 2L, 2L, 3L, 
2L, 3L, 2L, 3L, 3L, 4L, 2L, 1L, 4L, 1L, 1L, 3L, 2L, 2L, 2L, 3L, 
1L, 1L, 1L, 1L, 3L, 4L, 4L, 4L, 4L, 4L, 1L, 1L, 1L, 3L, 3L, 4L, 
4L, 3L, 4L, 4L, 4L, 4L, 3L, 3L, 1L, 4L, 4L, 1L, 4L, 4L, 1L, 3L, 
1L, 2L, 2L, 1L, 2L, 1L, 1L, 3L, 3L, 2L, 1L), .Label = c("ending", 
"mobile", "stem.first", "stem.second"), class = "factor")), .Names = c("freq", 
"class"), row.names = c(NA, -128L), class = "data.frame")

As I read in a previous post, there is more than one way to do this in R. I tried both the t.test function and the pairwise.t.test function.

For t.test, I subsetted the data frame by the classes to be compared and ran separate t-tests on the subsets.

ending.vs.mobile <- df[df$class=="ending"|df$class=="mobile",]
ending.vs.first <- df[df$class=="ending"|df$class=="stem.first",]
ending.vs.second <- df[df$class=="ending"|df$class=="stem.second",]
mobile.vs.first <- df[df$class=="mobile"|df$class=="stem.first",]
mobile.vs.second <- df[df$class=="mobile"|df$class=="stem.second",]
first.vs.second <- df[df$class=="stem.first"|df$class=="stem.second",]

t.test(ending.vs.mobile$freq ~ ending.vs.mobile$class, var.equal=TRUE)
t.test(ending.vs.first$freq ~ ending.vs.first$class, var.equal=TRUE)
t.test(ending.vs.second$freq ~ ending.vs.second$class, var.equal=TRUE)
t.test(mobile.vs.first$freq ~ mobile.vs.first$class, var.equal=TRUE)
t.test(mobile.vs.second$freq ~ mobile.vs.second$class, var.equal=TRUE)
t.test(first.vs.second$freq ~ first.vs.second$class, var.equal=TRUE)

As far as I have understood it (I might be wrong here), pairwise.t.test should be more convenient, as I don't need to create all the subsets and can run it on the original data frame.

pairwise.t.test(df$freq, df$class, p.adjust.method="none", paired=FALSE, pooled.sd=FALSE)

However, I get different results, most pronounced for the comparison ending vs. stem.second: p=0.7 using t.test and p=0.1 using pairwise.t.test.

What's wrong here? Where have I made a mistake?


Although the problem itself is solved, the reason it occurred makes me a little paranoid (I don't trust myself anymore): just by typing pooled.sd instead of pool.sd, I do not get the results I expect. Isn't this very error-prone?

In many other cases you can type variants, e.g. bonf instead of bonferroni. But here pooled.sd is completely ignored, although "pooled sd" is clearly intended. Granted, if you read the heading of the output carefully, you can guess that pooled.sd wasn't recognized, since it still says "t tests with pooled SD". But what if I never print this, e.g. when piping the output into a self-written function? There is a chance such an error would never be noticed.
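For illustration, the behaviour seems to come down to how R matches named arguments when a function has a ... argument (a minimal sketch with a made-up function f, not from any package):

f <- function(pool.sd = TRUE, ...) pool.sd
f(pool = FALSE)       # FALSE: partial matching works, "pool" is a prefix of "pool.sd"
f(pooled.sd = FALSE)  # TRUE: "pooled.sd" is not a prefix, so it is silently absorbed by ...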

Should I write to the R developers and suggest that both spelling variants be valid in future releases?

– absurd

3 Answers

The problem is not in the p-value correction but in the (declaration of the) variance assumptions. You have used var.equal=T in your t.test calls and pooled.sd=FALSE in your pairwise.t.test calls. However, the argument for pairwise.t.test is pool.sd, not pooled.sd. Because pairwise.t.test accepts additional arguments through ..., the misspelled pooled.sd is silently absorbed rather than raising an error, and the default pool.sd=TRUE is used instead. Changing the spelling gives p-values equivalent to the individual calls to t.test:

pairwise.t.test(df$freq, df$class, p.adjust.method="none", 
                paired=FALSE, pool.sd=FALSE)
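As a quick check (a sketch, reusing the df from the question; note that var.equal=TRUE is forwarded through ... to t.test, so the comparison matches the equal-variance calls above):

# Unpooled pairwise tests; var.equal=TRUE is passed on to t.test via '...'
pt <- pairwise.t.test(df$freq, df$class, p.adjust.method="none",
                      paired=FALSE, pool.sd=FALSE, var.equal=TRUE)
pt$p.value["stem.second", "ending"]  # entry for ending vs. stem.second

# The corresponding individual t-test from the question; should agree
sub <- df[df$class %in% c("ending", "stem.second"), ]
t.test(freq ~ class, data=sub, var.equal=TRUE)$p.value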
– busybear, Brian Diggs

There is nothing wrong here. You are doing different tests, since pairwise.t.test applies a correction to the p-values to adjust for the fact that you are making multiple comparisons.

(Simply put, if you are making multiple comparisons, you are increasing the chances of finding spurious results. A correction adjusts for this.)

The help for ?pairwise.t.test will point you to ?p.adjust, where you can find more detail.

(Or you can read that font of infallible wisdom: http://en.wikipedia.org/wiki/Multiple_comparisons)
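For illustration, a minimal sketch of what p.adjust does to a set of p-values (the numbers here are made up):

p <- c(0.01, 0.04, 0.30, 0.70)    # hypothetical p-values from four comparisons
p.adjust(p, method="bonferroni")  # each multiplied by 4, capped at 1
p.adjust(p, method="holm")        # step-down Bonferroni, uniformly less conservative
p.adjust(p, method="none")        # returned unchanged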

– Andrie
  • I was aware of that. But I thought that calling `pairwise.t.test` with the option `p.adjust.method="none"` would precisely suppress any corrections. So "none" does not mean "none" here? If I apply `p.adjust` with `method="none"` to the p-values of my `t.test` results, "none" really means "none": nothing is changed. – absurd Jul 12 '12 at 16:55

You need a one-way ANOVA, with a multiple-comparison procedure following a significant result. Additionally, your data likely has no pairing to it; pairing would be something like pre-test and post-test measurements on the same person, where the data are paired within each person.
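A minimal sketch of that workflow, reusing the df from the question and using Tukey's HSD as the follow-up procedure:

# One-way ANOVA of frequency by class
fit <- aov(freq ~ class, data=df)
summary(fit)  # overall F-test across the four classes

# If the F-test is significant, compare all pairs of class means while
# controlling the family-wise error rate
TukeyHSD(fit)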

  • Sorry, I don't understand what you mean. Isn't a one-way ANOVA comparing two classes (e.g. ending vs. second) the same as doing a t-test? The data are actually words of a certain grammatical class (class) that differ in usage frequency (freq). The frequency measures come from a prebuilt corpus. – absurd Jul 12 '12 at 18:40
  • Yes, you are correct: an ANOVA for two groups is the same as a pooled two-sample t-test. It looks as though you're doing all pairwise t-test comparisons, which can inflate the error rate, but this might not matter for your application. To control the error rate, an ANOVA is performed first; if it is significant, a multiple-comparison procedure such as Tukey's Honestly Significant Difference is used to compare pairs of means. The paired t-test should only be used when there is a natural pairing in the data; an extension of the paired t-test would be an ANOVA with blocks. – user1521694 Jul 12 '12 at 19:02