1

I don't ask questions here very often because I usually try to solve it on my own (using many many stackoverflow threads). However, I am stuck now and I would appreciate some help.

The aim:

I would like to perform a mvabund analysis using a taxon 1) abundance table (abundance):

ID  GoodBacteria    BadBacteria UnknownBacteria SomeBacteria
Dog 0   7337    101 0
Cat 0   4178.5  0   0
Horse   2   4294.333333 35.66666667 0
Snail   0   4350    27.5    0
Bird    0.5 4332.5  46  0
Whale   0.666666667 4809.666667 13.66666667 0
Fish    1   1522    29  0
Human   0   4679.4  28.46666667 0.033333333

and 2) an environmental parameter file (factors):

ID  Mutualistic Commensalistic  Parasitic
Dog YES NO  NO
Cat YES YES YES
Horse   NO  NO  NO
Snail   NO  NO  YES
Bird    YES YES NO
Whale   YES NO  YES
Fish    NO  NO  NO
Human   YES NO  NO

However, the original environmental parameter file contains >80 factors and we would like to test them all separately.

Typically the mvabund analysis is split into three functions:

library(mvabund)    
mva <- mvabund(abundance)
mod <- manyglm(mva ~ factor(factors$Mutualistic), family="negative.binominal")
aov <- anova(mod, p.uni="adjusted")

And the aov output would look like this:

Multivariate test:
        Res.Df  Df.diff Dev Pr(>Dev)
(Intercept) 7                       
factor(factors$Salinity)    6   1   7.983   0.101

Univariate Tests:
        GoodBacteria    BadBacteria UnknownBacteria SomeBacteria       
        Dev Pr(>Dev)    Dev Pr(>Dev)    Dev Pr(>Dev)    Dev Pr(>Dev)
(Intercept)
factor(factors$Salinity)    5.489 0.090 2.41 0.259  0.085 0.770 0 1.000   

First solution:

My first goal was to perform all three tests (i.e., Mutualistic, Commensalistic, and Parasitic) column by column by using a loop or something similar. This is something I don't have much experience with, but I solved it by following this thread: Column wise granger's causal tests in R

This is the sapply function I adopted and it works:

sapply(1:ncol(factors), function(i) {
  m1 <- anova((manyglm(mva ~ factor(factors[,i]), family="negative.binominal")), p.uni="adjusted")
  list(multip=m1$table[c(3,4)])
  })

It performs the anova.manyglm for each column and I can even extract the multivariate p values for each column - not perfect, but it works:

$multip
                          Dev Pr(>Dev)
(Intercept)                NA       NA
factor(factors[, i]) 7.983104    0.112

$multip
                          Dev Pr(>Dev)
(Intercept)                NA       NA
factor(factors[, i]) 1.862846    0.702

$multip
                          Dev Pr(>Dev)
(Intercept)                NA       NA
factor(factors[, i]) 5.655806    0.228

The problem:

However, I would also like to get the univariate results for each species and each factor. And this is where I struggle at the moment.

The anova output is structured like this:

    Length Class      Mode     
    family       1      -none-     character
    p.uni        1      -none-     character
    test         1      -none-     character
    cor.type     1      -none-     character
    resamp       1      -none-     character
    nBoot        1      -none-     numeric  
    shrink.param 2      -none-     numeric  
    n.bootsdone  1      -none-     numeric  
    table        4      data.frame list     
    uni.p        8      -none-     numeric  
    uni.test     8      -none-     numeric 

uni.p contains the p values and species names (e.g., GoodBacteria) and uni.test the Dev values and species names. But I still don't understand how I can extract these values together with the sapply function above in order to store everything in one output or dataframe.

Any help would be very much appreciated.

Update

I changed the script a bit

    sapply(1:ncol(factors), function(i) {
      m1 <- anova((manyglm(mva ~ factor(factors[,i]), family="negative.binominal")), p.uni="adjusted")
      unlist(data.frame(dev_p=m1$table[c(3,4)], uni_p=m1$uni.p, uni_dev=m1$uni.test))
    })

The output looks like this now, it is not perfect, but it is the right direction:

                              [,1]       [,2]         [,3]
d.Dev1                          NA         NA           NA
d.Dev2                  7.98310400 1.86284590 5.6558056350
d.Pr..Dev.1                     NA         NA           NA
d.Pr..Dev.2             0.08200000 0.64200000 0.2260000000
uni.GoodBacteria1               NA         NA           NA
uni.GoodBacteria2       0.08600000 0.62000000 0.2850000000
uni.BadBacteria1                NA         NA           NA
uni.BadBacteria2        0.25500000 0.88200000 0.9900000000
uni.UnknownBacteria1            NA         NA           NA
uni.UnknownBacteria2    0.78500000 0.78800000 0.2440000000
uni.SomeBacteria1               NA         NA           NA
uni.SomeBacteria2       1.00000000 1.00000000 1.0000000000
unidev.GoodBacteria1            NA         NA           NA
unidev.GoodBacteria2    5.48864799 1.44833012 2.4419358477
unidev.BadBacteria1             NA         NA           NA
unidev.BadBacteria2     2.40968141 0.03306757 0.0001129865
unidev.UnknownBacteria1         NA         NA           NA
unidev.UnknownBacteria2 0.08477459 0.38144821 3.2137568008
unidev.SomeBacteria1            NA         NA           NA
unidev.SomeBacteria2    0.00000000 0.00000000 0.0000000000
plik
  • 67
  • 9

0 Answers0