0

I am trying to use a for loop to fit a model to each of my metabolites. There are 790 columns I need to apply this to. The output of the results are three values ( estimate, std.error and p value), and my empty matrix has 790 rows and three columns for the results to be entered into. (Therefore, I aim to get an estimate, std.error and p value for each of the 790 metabolites, in order to be able to gauge whether there are increases or decreases comparing control relative to disease, and whether there is any statistical significance).

Please find below the code I have tried so far, any suggestions would be greatly appreciated.

  results.out <- matrix(0, nrow=790, ncol=3)

  require(lme4)
   require(lmerTest)

  for(i in 1:ncol(data[, 8:792])){
  fit <- lmer (data1[,i] ~ Diseasestatus + BB + ACOG + WA + BMI + Age + (1|ParticpantID), data=data, REML=F, na.action=na.omit)
  results<- summary(fit)$coef[2, c(1, 2, 5)][i]
  results.out[, i] <- results
  }

The error message I get with the above is

      Error in eval_f(x, ...) : Downdated VtV is not positive definite

( not sure whether this might be due to the presence of 0 and 1 in some columns. For instance disease is 1 and control is 0. Taking the medication BB, ACOG, WA is denoted 1, not taking it is 0.

or trying one of the apply functions also gets an error

 output <- apply(data[,8:792], 2, function(i){
 fit <- lmer (data[,i] ~ Diseasestatus + BB + ACOG + WA + BMI + Age + (1|ParticpantID), data=data, REML=F, na.action=na.omit)
  results<- summary(fit)$coef[2, c(1, 2, 5)][i]
  })

 dplyr::bind_rows(output, .id="Metabolite")

The error message from the above is;

  Error in model.frame.default(data = data, na.action = 
  na.omit,  : 
  invalid type (list) for variable 'data[, i]' 

A snapshot of my data in case that is useful can be found below;

    structure(list(Participant_ID = c(34L, 35L, 119L, 157L, 158L, 
    208L, 209L, 1364L, 1365L, 127911L, 127912L, 154110L, 154120L, 
    167113L, 167123L, 171713L, 171724L, 184212L, 184213L), BB = c(0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 
    0L, 1L), ACOG = c(1L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 
    1L, 1L, 1L, 1L, 1L, 0L, 0L, 1L), WA = c(0L, 0L, 0L, 0L, 0L, 0L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L), BMI = 
    c(23.94688606, 
   25.87052536, 26.38413048, 24.10971069, 27.77280045, 24.93728065, 
   26.8804493, 23.90113258, 25.07429123, 27.60118484, 23.12600708, 
   26.39195442, 23.01516533, 31.3666172, 31.80447578, 24.03654861, 
   25.11828613, 24.17065239, 28.48728561), Age = c(76L, 76L, 68L, 
  68L, 68L, 57L, 57L, 56L, 56L, 60L, 60L, 44L, 44L, 58L, 58L, 71L, 
  71L, 56L, 56L), Diseasestatus = c(0L, 0L, 1L, 1L, 0L, 0L, 1L, 
  1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L), Met1 = c(0.326537646, 
  0.362137501, 0.403331692, 0.343789581, 0.437786804, 0.720648545, 
  0.974583105, 0.565800103, 0.613001417, 0.547743467, 0.337683125, 
  0.393250468, 0.465795971, 0.390206584, 0.172261362, 0.382496277, 
  0.435237338, 0.945312001, 0.321214419), Met2 = c(0.465736593, 
  0.540715637, 0.472693123, 0.681156674, 0.416291697, 0.487306504, 
  0.499092007, 0.634904337, 0.408109505, 0.808546214, 0.4113336, 
  0.924069141, 0.673204104, 0.693500596, 0.522794352, 0.373602067, 
  0.716407827, 0.649634492, 0.514429127), Met3 = c(0.902854296, 
  0.413241218, 0.418436978, 0.599698582, 0.806269489, 0.746859677, 
  0.461750237, 0.534943022, 0.511101841, 0.339406025, 0.235624644, 
  0.405761674, 0.312947287, 0.409833325, 0.026137354, 0.477175654, 
  0.387610389, 0.226427797, 0.19742037), Met4 = c(0.99425024, 0.923934731, 
  0.804677487, 0.31081605, 0.351561982, 0.529615606, 0.756342125, 
  0.968115646, 0.989016517, 0.938703504, 0.841777433, 0.103150219, 
  0.68397041, 0.903129097, 0.897388285, 0.905293975, 0.992337012, 
  0.358619626, 0.159601445), Met5 = c(0.527268407, 0.646332723, 
  0.646042578, 0.163344212, 0.202267074, 0.536976636, 0.789061409, 
  0.725657854, 0.697350164, 0.044081822, 0.959496477, 0.295039796, 
  0.120109301, 0.160817478, 0.901107461, 0.529179518, 0.573373775, 
  0.560701172, 0.325806613), Met6 = c(0.809497068, 0.614253411, 
  0.375421856, 0.446069992, 0.710859888, 0.474587655, 0.217817798, 
  0.464787031, 0.5540375, 0.62822217, 0.082906217, 0.294754096, 
  0.862216149, 0.427856328, 0.418944666, 0.516181576, 0.544516281, 
  0.519113772, 0.279522811), Met7 = c(0.419627992, 0.365954584, 
  0.434398151, 0.313441811, 0.368051981, 0.660614914, 0.825809828, 
  0.412109302, 0.545740249, 0.326247449, 0.373035298, 0.380623499, 
  0.428859232, 0.321044089, 0.24939936, 0.298372835, 0.387467105, 
  0.906034877, 0.147250125), Met8 = c(0.549683979, 0.347795497, 
  0.465729386, 0.625045713, 0.551784129, 0.348174756, 0.4334509, 
  0.594903245, 0.561353241, 0.621274979, 0.231389704, 0.308801446, 
  0.464799907, 0.401663011, 0.332966555, 0.109698561, 0.184359915, 
  0.091447702, 0.20568595), Met9 = c(0.605266628, 0.316564583, 
  0.166558136, 0.337470002, 0.458328756, 0.409329111, 0.269424154, 
  0.514746553, 0.408357879, 0.572246814, 0.264718681, 0.125162297, 
  0.211230627, 0.655667116, 0.034006203, 0.189685846, 0.243832622, 
  0.360657636, 0.259174139), Met10 = c(0.576174353, 0.214361265, 
  0.523133504, 0.549085457, 0.430400583, 0.53943429, 0.441563681, 
  0.401805576, 0.386025835, 0.514017513, 0, 0.330305736, 0.567380079, 
  0.50505895, 0.242814909, 0.306522744, 0.132950297, 0.207312191, 
  0.328760686)), class = "data.frame", row.names = c(NA, -19L))
jaria20
  • 61
  • 7
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. We don't need all your real data, just something simple to actually run the code. Be sure to write the exact error messages you get in the question so we have some idea what's going on. – MrFlick Mar 31 '20 at 16:01
  • @MrFlick Please find above an edited post which hopefully provides a bit more information. I will add an answer if I can work this out. Thank-you – jaria20 Mar 31 '20 at 16:11
  • 1
    It looks like you have a problem with your apply statement. Given your apply statement `apply(data[,8:792], 2, function(i) {` then "i" is the column so your model should be `lmer (i ~ Diseasestatus +. . .` – Dave2e Mar 31 '20 at 17:07
  • @Dave2e Thank-you it now runs and is able to get the output I need! I do not need the dplyr bit at the end. If you're happy to, if you add this as an answer, I can select it as the correct one. All the best – jaria20 Mar 31 '20 at 19:01

0 Answers0