1

I have two cohorts so I did a linear regression seperately per cohort and used a for loop so that my coefficients have been saved per cohort. I now want to get a pooled estimate per SNP, but I have 53 SNPs so would prefer not having to type out all the coefficients by hand. Is there a way to make a for loop to use in the rma command from metafor?

So far I've come as far as thinking that it's probably easiest to merge my two coefficient files together. I've called this coeffs. The first column has the SNP names, the 2nd and 6th columns have the betas from cohort 1 and 2, respectively and columns 3 and 7 have the standard errors from the two cohorts.

So I want to make an item beta that includes my beta from cohort 1 and cohort 2 per for 1 SNP. Then the same idea with se. I then want to have an rma(beta,se) per SNP so I can export the results to excel.

So far I thought of doing the following (but it doesn't work)

output3 <- data.frame(matrix(nrow=84,ncol=3))
names(output3)=c("Pooled Estimate", "Pooled Std.Error", "P-value")

for(l in 3:84){ 
    beta <-c(output3[l,2], output3[l,6])
    se <-c(output3[l,3], output3[l,7])
    pool <- rma(beta,se)
}

When I run the rma I get the following error message:

Error in [[<-.data.frame(*tmp*, l, value = list(b = -0.105507438518734, : replacement has 70 rows, data has 84

If I change nrow to 70, then I don't get the information. From the rma output I want the second row and columns 1,2 and 4. I think this is going wrong somewhere.

zx8754
  • 52,746
  • 12
  • 114
  • 209
  • Careful: The second argument of the ``rma()`` function is for the **sampling variances**, not the standard errors. So, you should use ``rma(beta, sei=se)``. Or you could do ``rma(beta, se^2)``. – Wolfgang Jan 28 '16 at 18:27
  • SNP is a genetic variant Thanks Wolfgang I didn't realise that! – Claire 'Scotty' Bear Jan 29 '16 at 07:43
  • 1
    [How to make a great R reproducible example?](http://stackoverflow.com/questions/5963269) – zx8754 Jan 29 '16 at 08:57

2 Answers2

1

I figured out my mistake, my problem was I forgot to tell R what lines of data I needed and where it needed to be saved to.

For anyone else with this problem, here is my script which worked. I first created my data.frame where I wanted the data to be saved.

output3 <- data.frame(matrix(nrow=84,ncol=3))
names(output3)=c("Pooled Estimate", "Pooled Std.Error", "P-value")

Next I made a for loop to extract the betas and s.e's from each SNP

for(l in 3:84){ 
  beta <-c(coeffs[l,2], coeffs[l,6])
  se <-c(coeffs[l,3], coeffs[l,7])
  pool <- rma(beta,se^2)
  z3 <- colnames(qcwomenc[1:84])
  row.names(output3)<-z3
  output3[l,1]<-coef(summary(pool))[1,1]
  output3[l,2]<-coef(summary(pool))[1,2]
  output3[l,3]<-coef(summary(pool))[1,4]
}
zx8754
  • 52,746
  • 12
  • 114
  • 209
  • 1
    You could also save `result.list[l] <- coef(summary(pool))[1, c(1, 2, 4)]` and then do `do.call("rbind", result.list)` outside the loop. Another option would be `output3[l, c(1, 2, 3)] <- coef(summary(pool))[1, c(1, 2, 4)]`. – Roman Luštrik Jan 29 '16 at 09:35
0

You say it is not working but you don't say what exactly. I think what you are missing is something to save your results. Like a list or data.frame. The variable pool gets updated on each iteration of the loop so after the loop is through it will contain only the last model. Also, your indices do not match the example data.frame as you are referring to column 6 and 7 which do not exist. But I guess they do exist in your actual data.frame. Also your example data.frame is full of NA values. Maybe try like this:

output3 <- data.frame(matrix(runif(84*4), nrow=84,ncol=4))
names(output3)=c("se1", "beta1", "se2", "beta2")

modellist <- list()
for(l in 3:84){ 
    beta <-c(output3[l,2], output3[l,4])
    se <-c(output3[l,1], output3[l,3])
    pool <- sum(beta, se)
    modellist[[l]] <- pool
}
modellist

Note, I used the sum instead your rma() because I don't know this function and I don't know what package it's from.

vanao veneri
  • 970
  • 2
  • 12
  • 31
  • Thanks @vanao Veneri but unfortuantely that doesn't work. I updated my post with a better description of the problem. I think I'm extracting the rma output incorrectly. I want per SNP the estimate, standard error and p value (so columns 1,2 and 4) to be saved... at the moment I'm not sure what I'm saving but it's nothing relevant and all going into the third column of output3 – Claire 'Scotty' Bear Jan 29 '16 at 08:48