I'm trying to create a Manhattan plot on Linux. This is my first time using qqman, and I am stuck on the error below. Here is my R code:

library(data.table)
library(qqman)
data = fread("/z/Comp/lu_group/Members/jwlorge/ATN/scripts/data/temp_gwas/output_1/factor_1.txt", fill=TRUE, header=TRUE)
data$CHR = as.numeric(data$CHR)
data$BP = as.numeric(data$BP)
data$Pval_Estimate = as.numeric(data$Pval_Estimate)
jpeg('corrplot1.jpg')
manhattan(data,chr="CHR", bp="BP", snp="SNP", p="Pval_Estimate")
dev.off()

jpeg('qqplot1.jpg')
qq(data$Pval_Estimate)
dev.off()

Which gives me the following error:

Error in rep.int(seq_along(unique(d$CHR)), times = tapply(d$SNP, d$CHR,  :
  invalid 'times' value
Calls: manhattan

I searched for the error online and couldn't find it anywhere. I am completely stumped. Does anyone know what it means? Thanks.

Johnny
  • What is the variable `d`? In your example code, you use `data`. It looks like there is an issue with the values in the CHR or SNP columns. Check that they are numeric, and that there are no negative or missing values. – neilfws Sep 06 '22 at 00:12
  • @neilfws I am also getting an error that there are NAs introduced by coercion. Would that cause this? I made sure CHR and SNP are numeric. – Johnny Sep 06 '22 at 00:17
  • Yes, I think NA values will create the 'times' error. You can check using _e.g._ `any(is.na(data$CHR))`. – neilfws Sep 06 '22 at 00:24
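The check suggested in the comments can be taken one step further by listing which labels fail to convert. The sketch below is only an illustration: `chr_raw` is a hypothetical stand-in for the CHR column as it is read from the file, before any `as.numeric()` call overwrites it.

```r
# Hypothetical stand-in for the raw CHR column (an assumption, not the asker's data)
chr_raw <- c("1", "2", "X", "22", "MT", NA, "3")

# Coerce once, silencing the "NAs introduced by coercion" warning
chr_num <- suppressWarnings(as.numeric(chr_raw))

# Labels that were present in the input but could not be converted
bad_values <- unique(chr_raw[is.na(chr_num) & !is.na(chr_raw)])
print(bad_values)

# Number of rows that would carry NA into manhattan()
n_na <- sum(is.na(chr_num))
print(n_na)
```

Running the same check on the real column before converting it should show whether labels such as X or MT, or blank rows produced by `fill=TRUE`, are the source of the NA values.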

1 Answer

If your "CHR" column contains non-numeric chromosome labels such as X, Y or MT, manhattan() returns an error. You can 'fix' it by converting the column to a factor and then to a number, e.g.

library(qqman)
#> 
#> For example usage please run: vignette('qqman')
#> 
#> Citation appreciated but not required:
#> Turner, (2018). qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. Journal of Open Source Software, 3(25), 731, https://doi.org/10.21105/joss.00731.
#> 

createSampleGwasData<-function(chr.count=10, include.X=TRUE) {
  CHR<-c(); POS<-c()
  for(i in 1:chr.count) {
    CHR <- c(CHR,rep(i, 1000))
    POS <- c(POS,ceiling(runif(1000)*(chr.count-i+1)*25*1e3))
  }
  if(include.X) {
    CHR <- c(CHR,rep("X", 1000))
    POS <- c(POS,ceiling(runif(1000)*5*25*1e3))
  }
  P <- runif(length(POS))
  return(data.frame(CHR, POS, P, SNP = "rsid"))
}
dd<-createSampleGwasData()

manhattan(dd, chr="CHR", bp="POS", snp="SNP", p="P")
#> Error in manhattan(dd, chr = "CHR", bp = "POS", snp = "SNP", p = "P"): CHR column should be numeric. Do you have 'X', 'Y', 'MT', etc? If so change to numbers and try again.

dd$CHR <- as.numeric(as.factor(dd$CHR))
manhattan(dd, chr="CHR", bp="POS", snp="SNP", p="P")

Created on 2022-09-06 by the reprex package (v2.0.1)
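One caveat with the factor conversion: factor levels built from character labels are sorted as text, so "10" sorts before "2" and the resulting codes no longer match the chromosome numbers. A possible alternative, sketched here under stated assumptions, is an explicit lookup for the non-numeric labels. The `special` vector and its 23/24/25 codes are an assumed convention for illustration, not something qqman requires, and `chr_raw` is a hypothetical input.

```r
# Hypothetical chromosome labels as they might appear in a GWAS file
chr_raw <- c("1", "2", "10", "X", "Y", "MT")

# Assumed mapping for non-numeric labels (illustrative convention only)
special <- c(X = 23, Y = 24, MT = 25)

# Convert the labels that are already numbers; the rest become NA for now
chr_num <- suppressWarnings(as.numeric(chr_raw))

# Fill in the special chromosomes from the lookup
is_special <- chr_raw %in% names(special)
chr_num[is_special] <- unname(special[chr_raw[is_special]])

print(chr_num)

# For comparison: the factor route assigns codes in sorted-text order
print(as.numeric(as.factor(chr_raw)))
```

With this mapping the autosomes keep their own numbers and the plot order stays 1, 2, ..., 22, X, Y, MT. If I recall correctly, `manhattan()` also accepts a `chrlabs` argument for relabelling the axis ticks; check `?manhattan` before relying on it.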

jared_mamrot