
I want to plot SNP-based and k-mer-based GWAS results in a single Manhattan plot. I can highlight the k-mer-based associations, but at some positions the two types overlap and one is hidden under the other. If I could set a different cex for only the highlighted k-mer points, I could easily see whether SNP-based associations are also present beneath them. The code I am using:

# Install fastman from GitHub if it is not already available, then load it
if (!require(fastman)) {
  devtools::install_github('kaustubhad/fastman', build_vignettes = TRUE)
  library(fastman)
}
library(data.table)
# Example data set
leafcd_All <- setDT(
  structure(list(
    chr = c(2L, 6L, 8L, 7L, 6L, 5L, 8L, 4L, 1L, 2L,
            1L, 4L, 4L, 6L, 5L, 2L, 2L, 7L, 1L, 4L),
    ps = c(20812578L, 21922458L, 17725620L, 13743808L, 5562625L,
           10048263L, 5509174L, 8047708L, 16788431L, 1210943L,
           23409224L, 16606297L, 16789190L, 517233L, 14615216L,
           20573133L, 13059817L, 12658352L, 19534214L, 10930609L),
    rs = c("TGTAAGGTTGTTCCTAGAAATAATTGGCAAA_2153", "6_21922458",
           "8_17725620", "7_13743808", "6_5562625", "5_10048263",
           "8_5509174", "4_8047708", "1_16788431", "2_1210943",
           "1_23409224", "4_16606297", "4_16789190", "6_517233",
           "5_14615216", "2_20573133", "2_13059817", "7_12658352",
           "1_19534214", "4_10930609"),
    af = c(0.515, 0.20052599941513, -0.0991540413866392, 0.249140233643853,
           -0.258459039498031, -0.082762873887825, 0.208875049809843,
           0.142353227958131, 0.238243182747475, -0.0952385144332126,
           0.14830584593916, 0.087113892379901, 0.0773651947896911,
           -0.162990586405916, 0.169921912690918, 0.0709804692588175,
           -0.0713894932748928, 0.120534157624669, 0.0635756437123949,
           0.0660060899435789),
    p_value = c(7.824069e-12, 0.00042769577040115, 0.00272540332860084,
                0.00479242286846797, 57412965, 0.0085338001721863,
                0.00959456164547606, 0.00966518049928214, 0.00993123682688346,
                0.0117703417292982, 0.0149086053151355, 0.0150152941107474,
                0.0174181286161663, 0.0180474657048135, 0.0185344464537632,
                0.0214819012476155, 0.0241552335174933, 0.0247393426120147,
                0.0327745679439906)),
  row.names = c(NA, -20L), class = c("data.frame")))

# Full results and the IDs to highlight are read from files
leafcd_All <- fread('output2.txt')
highlight_snps <- scan("highlight.txt", what = character())
fastman(leafcd_All, snp = "rs", chr = "chr", bp = "ps", p = "p_value",
        ylab = "-LogP_leaffe_boxcox_All", col = c("darkgreen", "mediumvioletred"),
        main = NULL, suggestiveline = FALSE, genomewideline = 7.73,
        cex = 0.8, cex.lab = 0.8, highlight = highlight_snps)

But I don't know how to set a different point size for the highlighted SNPs above.

  • From the documentation: **cex** = A numerical **vector** giving the amount by which plotting characters and symbols should be scaled relative to the default. (https://github.com/kaustubhad/fastman). Thus: supply a vector as long as your number of points, with each value giving the expansion for that point. For more specific replies, please provide a reproducible example: https://stackoverflow.com/help/minimal-reproducible-example – I_O Jun 18 '23 at 14:21
  • Based on your example data, you can set `cex` (or other point-wise parameters like `col`) this way: `fastman(..., cex = ifelse(leafcd_All$rs %in% highlight_snps, 5, 1), ...)`, which sets `cex` to 5 if rs is found among the SNPs to be highlighted and to 1 otherwise. – I_O Jun 18 '23 at 16:09
  • I tried it, but all the points are still the same size; the points in `highlight_snps` are no different. – Vinod Kumar Jun 18 '23 at 20:02
  • You could do a quick check such as `any(leafcd_All$rs %in% highlight_snps)`. If this returns `FALSE`, none of the values of rs actually matches the values you want highlighted. – I_O Jun 18 '23 at 22:11
  • Still not reproducible even after I fixed all of the errors in code, formatting, and posting. Need a `highlight_snps` object. – IRTFM Jun 19 '23 at 00:14
  • Unfortunately you cannot use dput output to post data.table objects and expect people to get them to work. The internal pointer is meaningless even if the data.table package is loaded, since it refers to a memory location on your machine. The object needs to be either read in with fread from a file-like stream or reconstructed locally with setDT from a data frame. – IRTFM Jun 19 '23 at 00:21
  • @I_O `any(leafcd_All$rs %in% highlight_snps)` is TRUE. – Vinod Kumar Jun 19 '23 at 15:00
  • When I ran: `fastman(leafcd_All, snp = "rs", chr="chr", bp = "ps", p="p_value", ylab="-LogP_leaffe_boxcox_All", col = c("darkgreen", "mediumvioletred"), main=NULL,suggestiveline =FALSE, genomewideline=7.73, cex = ifelse(leafcd_All$rs %in% highlight_snps, 5, 1), cex.lab=0.8,highlight=highlight_snps)` with the data and `highlight_snps` as a sample from the values in rs, I got what you requested. So I am suggesting that this question be closed as not reproducible. – IRTFM Jun 19 '23 at 21:30
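
The per-point `cex` approach from the comments can be sketched in base R as below. This is a minimal sketch, not a tested fastman recipe: the data frame `gwas`, the flag `is_highlighted`, the vector `point_cex`, and the sizes 1.6 and 0.8 are hypothetical names and values chosen for illustration, and the rounded p-values are toy numbers. It also assumes that fastman applies a `cex` vector in the row order of the input, which the comments suggest but which is worth checking on your own data.

```r
# Toy stand-ins for the question's objects (values are illustrative only).
gwas <- data.frame(
  rs      = c("TGTAAGGTTGTTCCTAGAAATAATTGGCAAA_2153", "6_21922458", "8_17725620"),
  chr     = c(2L, 6L, 8L),
  ps      = c(20812578L, 21922458L, 17725620L),
  p_value = c(7.824069e-12, 4.28e-04, 2.73e-03)
)
highlight_snps <- c("TGTAAGGTTGTTCCTAGAAATAATTGGCAAA_2153")

# One logical flag per row: TRUE where the marker should be highlighted.
is_highlighted <- gwas$rs %in% highlight_snps

# Count the matches instead of only testing any(), so a partial mismatch
# between the two ID sets (for example from stray whitespace) is visible.
sum(is_highlighted)

# One cex value per row: larger for highlighted markers, smaller otherwise.
point_cex <- ifelse(is_highlighted, 1.6, 0.8)

# The vector would then replace the scalar cex, as proposed in the comments
# (left commented out here because it needs the fastman package):
# fastman(gwas, snp = "rs", chr = "chr", bp = "ps", p = "p_value",
#         cex = point_cex, highlight = highlight_snps)
```

Because `point_cex` is built from the same data frame that is passed to the plotting call, it has one entry per row, which is the vector form the documentation quoted above describes.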
