8

I have a cell array of p-values that have to be adjusted for multiple comparisons. How can I do that in Matlab? I can't find a built-in function.

In R I would do:

data.pValue_adjusted = p.adjust(data.pValue, method='bonferroni')

Is there a similiar function for Matlab? Ideally one that performs different adjustment methods (Bonferroni, Benjamini-Hochberg, FDR ...)?

JayPeerachai
  • 3,499
  • 3
  • 14
  • 29
Martin Preusse
  • 9,151
  • 12
  • 48
  • 80

6 Answers6

5

If you have Bioinformatics Toolbox, you can use MAFDR function to calculate p-values adjusted by False Discovery Rate.

yuk
  • 19,098
  • 13
  • 68
  • 99
  • 1
    For those without access to the Bioinformatics Toolbox there is an implementation of both Benjamini & Hochberg and Benjamini & Yekutieli at [matlab central](http://www.mathworks.com/matlabcentral/fileexchange/27418-benjamini-hochbergyekutieli-procedure-for-controlling-false-discovery-rate). Though not a drop-in replacement, it can be used instead of MAFDR. – Eponymous Apr 02 '14 at 01:38
  • 1
    @Eponymous It's worth noting, though, that the official `mafdr` implementation by default uses the method of Storey (2002), which is generally more powerful than the original Benjamini-Hochberg version. Documentation [here](http://www.mathworks.com/help/bioinfo/ref/mafdr.html). – Matt Jan 27 '15 at 03:05
1

For people without the Bioinformatics Toolbox, the FDR (False Discovery Rate) method is also very nicely described here, it also provides a link with an fdr script.

Fraukje
  • 673
  • 3
  • 20
1

A MATLAB/Octave implementation of R's p.adjust function is available here. It can perform p-value adjustment for multiple comparisons with the following methods, equivalent to their R counterparts:

  • Holm
  • Hochberg
  • Hommel
  • Bonferroni
  • BH
  • BY
  • fdr
  • Sidak (this one is not available in the R function)

Disclaimer: I'm the author of this package.

faken
  • 6,572
  • 4
  • 27
  • 28
0

See also http://uk.mathworks.com/matlabcentral/fileexchange/28303-bonferroni-holm-correction-for-multiple-comparisons and https://en.wikipedia.org/wiki/Holm%E2%80%93Bonferroni_method for background.

Jan Schnupp
  • 91
  • 1
  • 4
  • While the links may answer the question, it is better to include the essential parts of the answer here and provide links for reference. Link-only answers can become invalid if the linked page changes. – ZygD Jul 15 '15 at 16:06
0

Have a look at T-test with Bonferroni Correction and related files in the Matlab File-exchange.

I hope this helps.

ephsmith
  • 398
  • 3
  • 12
0

This submission is probably what you are looking for, but it only implements the Bonferroni-Holm method. You would have to search the FEX for similar solutions to the other correction methods..

That said the Statistics Toolbox has the MULTCOMPARE method which is designed for multiple comparison tests, though it does not return the corrected p-values. Here is an example:

load fisheriris
[pVal tbl stats] = kruskalwallis(meas(:,1), species)   %# Kruskal-Wallis or ANOVA
title('Sepal Length'), xlabel('Groups'), ylabel('Value')

[c,m] = multcompare(stats, 'ctype','bonferroni', 'display','on');
Amro
  • 123,847
  • 25
  • 243
  • 454
  • Bonferroni-Holm is good enough I think. I don't understand the details anyway ;) – Martin Preusse Sep 06 '11 at 18:02
  • A quick read of the documentation suggests that multompare is only for anova-like measures (it seems to use t-test critical values - see the description of bonferroni - rather than adjusting p values) – Eponymous Apr 01 '14 at 14:36
  • 2
    There is an important difference between Bonferroni* (FWER/Family-wise-error rate) and Benjamini* (FDR/False discovery rate). Very very roughly, the significance (aka alpha) in FWER is the probability that the test incorrectly rejects the null hypothesis even once. In FDR, the significance is the proportion of the rejections (discoveries) that are incorrect. So, if 40 are significant at a 95% level, under FWER that means that there is a 1 in 20 chance that 1 match is wrong. Whereas for FDR it means 1/20 are wrong - so on average 2 are wrong. – Eponymous Apr 01 '14 at 14:55
  • 1
    @Eponymous: Thanks for the explanation. I think you are right about `multcompare`, seeing that its input is the `stats` structure which only contains the t-values not the p-values... As others have shown below, the solution is to use `ttest2` to compute the raw p-values from multiple tests, then call `mafdr` from the Bioinformatics toolbox to get the adjusted p-values according to the Benjamini & Hochberg (FDR) method. I probably should have mentioned that I am not a statistician and this is not my area of expertise :) – Amro Apr 01 '14 at 17:01