I am fairly new to R, and have recently started using it to analyse some microarray data. The overall aim of the analysis is to take DC2 and compare the WT vs KO groups in this population. But I have come across some problems with the limma processing. After processing the data using the oligo package, I have then tried to create a design matrix for analysis using limma. This is my workflow for the ExpressionSet of DC2:
pData(DC2)
index filename genotype cell_type
1 KO DC2 2 HP10.CEL KO DC2
2 KO DC2 3 HP11.CEL KO DC2
3 KO DC2 4 HP12.CEL KO DC2
1 WT DC2 10 HP7.CEL WT DC2
2 WT DC2 11 HP8.CEL WT DC2
3 WT DC2 12 HP9.CEL WT DC2
design <- model.matrix(~DC2$genotype)
design
(Intercept) DC2$genotypeWT
1 1 0
2 1 0
3 1 0
4 1 1
5 1 1
6 1 1
fit <- lmFit(DC2, design)
fit <- eBayes(fit)
toptable(fit)
This feeds out a list of genes as follows:
logFC t P.Value adj.P.Val B
17551163 14.09722 208.2627 2.990326e-13 2.700912e-10 17.14467
17511316 13.91167 205.0811 3.292503e-13 2.700912e-10 17.12716
17551167 13.92093 204.5801 3.343243e-13 2.700912e-10 17.12434
17375373 13.76320 202.1271 3.605170e-13 2.700912e-10 17.11025
17550685 13.74022 201.5428 3.671032e-13 2.700912e-10 17.10682
However, when I go to check this manually (just taking the first feature) using this code:
toptable(fit, n=1)
genename <- rownames(toptable(fit, n=1))
typeMean <- tapply(exprs(DC2)[genename,], DC2$genotype, mean)
typeMean["KO"] - typeMean["WT"]
The output for the same feature "17551163" is different
KO
0.04538317
I have tried to search around for an answer, but with no luck. I'm assuming that this may be something to do with the matrix design? Any help would be greatly appreciated.
Thanks