Meaning of coef in limma

Question

tl;dr

why isn't fit <- eBayes(fit); topTable(fit, coef=4) the same as fit <- contrasts.fit(fit, c(-1,0,0,1)); fit <- eBayes(fit); topTable(fit) (column 1 of the design beeing the intercept)?

Example from the limma usersguide

Strain <- factor(targets$Strain, levels=c("WT","Mu"))
Treatment <- factor(targets$Treatment, levels=c("U","S"))
design <- model.matrix(~Strain+Strain:Treatment)
colnames(design)
[1] "(Intercept)" "StrainMu" "StrainWT:TreatmentS" "StrainMu:TreatmentS"

The first term in the model formula is an effect for Strain. This introduces an intercept column to the design matrix, which estimates the average log-expression level for wild-type unstimulated cells, and a column for Strain which estimates the mutant vs wildtype dierence in the unstimulated state. The second term in the model formula represents the interaction between stimulation and strain. [...] It introduces a third and a fourth column to the design matrix which represent the effect of stimulation for wild-type and for mutant mice respectively [...].

fit <- lmFit(eset, design)
fit <- eBayes(fit)
topTable(fit, coef=3)
# will find those genes responding to stimulation in wild-type mice, and
topTable(fit, coef=4)
# will find those genes responding to stimulation in mutant mice

What I don't understand

If using coef is the same as looking at the difference between the 4th column of the design matrix and the intercept (i.e. the contrast between the fourth and first column), wouldn't we need to look at the contrast between the fourth and second column to get the genes responding to stimulation in mutant mice?

Of course I compared the results when using coef and when using contrasts. They differ but I do not understand why... Obviously it means that coef=4 does not mean "look at the difference between column 4 and the intercept", but what does it mean then?

I hope that the question is understandable. Many thanks in advance!

Russ Hyde · Accepted Answer · 2018-02-02T11:02:53.413

The design matrix is based on

targets <- data.frame(
    Strain = factor(c("WT", "WT", "MU", "MU", "MU"), levels = c("WT", "MU")),
    Treatment = factor(c("U", "S", "U", "S", "S"), levels = c("U", "S")))
design <- model.matrix(~ Strain + Strain:Treatment, data = targets)

> targets
##   Strain Treatment
## 1     WT         U
## 2     WT         S
## 3     MU         U
## 4     MU         S
## 5     MU         S

Each row of targets corresponds to an experimental sample. The design matrix looks like this:

##      (Intercept) StrainMU StrainWT:TreatmentS StrainMU:TreatmentS
##    1           1        0                   0                   0
##    2           1        0                   1                   0
##    3           1        1                   0                   0
##    4           1        1                   0                   1
##    5           1        1                   0                   1

Again, each row corresponds to an experimental sample. The columns of design correspond to coefficients that are fitted by limma and you can read off what combination of coefficients gives the model-fitted value for a given experimental group by comparing the rows of design with those of targets.

Looking at coef=4 effectively means that you're testing the null hypothesis that the fourth coefficient (that for StrainMu:TreatmentS) is zero - it isn't the same as comparing the value of the fourth coefficient against the value of the intercept coefficient.

Think in terms of the fitted value for each of the experimental classes. For a given gene,

if I was a wildtype, unstimulated mouse (as in the first row of design or targets), my fitted value would be: Intercept
if I was a wildtype, stimulated mouse (second row), my fitted value would be: Intercept + StrainWT:TreatmentS
if I was a mutant, unstimulated mouse (third row), my fitted value would be: Intercept + StrainMu
if I was a mutant, stimulated mouse (fourth and fifth rows), my fitted value would be: Intercept + StrainMu + StrainMU:TreatmentS

So the difference between the stimulated and unstimulated group within the mutant strain is:

(Intercept + StrainMU + StrainMU:TreatmentS) - (Intercept + StrainMU) = StrainMU:TreatmentS

... the coefficient corresponding to the 4th column in the design matrix

Hopefully that was helpful

ps, using coef=4 should give you the same result as using contrast = c(0, 0, 0, 1)

Meaning of coef in limma

tl;dr

Example from the limma usersguide

What I don't understand

1 Answers1