
I am trying to draw a line connecting the top points of a Manhattan plot, so that only the 'skyline' of the plot is shown. I think I could manually select the points to connect and pass them to the 'lines' command, but is there an easier way to do this in R?

This is an example dataset and plot:

affy <-c(40220, 41400, 33801, 32334, 32056, 31470, 25835, 27457, 22864, 28501, 26273, 24954, 19188, 15721, 14356, 15309, 11281, 14881, 6399, 12400, 7125, 6207)
CM <- cumsum(affy)
n.markers <- sum(affy)
n.chr <- length(affy)
test <- data.frame(chr=rep(1:n.chr,affy),pos=1:n.markers,p=runif(n.markers))
plot(test$pos, -log10(test$p))

Thank you in advance for the help!

user971102
  • How do you define the top points? Is there a standard way of doing this? I think you'll have to create a partition of your x-interval, and then select the top points within each partition. – Ferdinand.kraft Apr 07 '13 at 13:19
  • The top points are the highest unique x values at each y value… I guess I can just select those and draw a line through them, just wondering if there was a better way to do this since I thought this procedure was done frequently. – user971102 Apr 07 '13 at 13:37
  • I think you mean the highest *y* values at each *x*. The problem is that your points do not share x-coordinates. You need to group them, using a partition for instance. – Ferdinand.kraft Apr 07 '13 at 13:41
  • I think if you search for "frontier" you will find solutions to the problem that @Ferdinand.kraft has quite rightly pointed out. – Ari B. Friedman Apr 07 '13 at 14:09
  • [Is this](http://stackoverflow.com/questions/9106401/select-rows-of-a-data-frame-based-on-column-properties/9106581#9106581) what you're looking for? – Josh O'Brien Apr 07 '13 at 14:42
  • Thank you all for your help, and thanks for correcting me. What I would like is indeed to select the highest y values at each x and connect these points, to show the trend of an association study and where it peaks. This would help show schematically whether it is bimodal or how far the peak region extends. I see it is not as easy as I thought… I'll look into 'frontier'… – user971102 Apr 07 '13 at 15:05
  • Thank you Josh O'Brien, I think the skyline query is what I am looking for! – user971102 Apr 07 '13 at 15:27
  • Great. Then I'm going to suggest we close this as a duplicate (which will more officially link the two questions.) – Josh O'Brien Apr 07 '13 at 16:13
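
For reference, the partition idea from the comments above could be sketched roughly as follows: split the x-axis into fixed-width bins, keep the highest point in each bin, and join those points with 'lines'. This is only a minimal sketch, not the skyline query from the linked answer. The names `bin.width`, `logp`, `bin`, `top.idx` and `skyline` are made up for illustration, the bin width of 5000 markers is an arbitrary assumption, and `set.seed(1)` is added only so the random p-values are reproducible.

```r
# Example data from the question (seed added as an assumption, for reproducibility)
set.seed(1)
affy <- c(40220, 41400, 33801, 32334, 32056, 31470, 25835, 27457, 22864,
          28501, 26273, 24954, 19188, 15721, 14356, 15309, 11281, 14881,
          6399, 12400, 7125, 6207)
n.markers <- sum(affy)
n.chr <- length(affy)
test <- data.frame(chr = rep(1:n.chr, affy),
                   pos = 1:n.markers,
                   p   = runif(n.markers))
test$logp <- -log10(test$p)

# Hypothetical bin width, in markers; smaller bins follow the peaks more closely
bin.width <- 5000
test$bin <- (test$pos - 1) %/% bin.width

# For each bin, find the row index of the point with the highest -log10(p)
top.idx <- tapply(seq_len(nrow(test)), test$bin,
                  function(i) i[which.max(test$logp[i])])
skyline <- test[as.integer(top.idx), ]

# Plot all points, then overlay the line through the per-bin maxima
plot(test$pos, test$logp, pch = ".", col = "grey60",
     xlab = "pos", ylab = "-log10(p)")
lines(skyline$pos, skyline$logp, col = "red", lwd = 2)
```

Because `pos` is already sorted, the selected rows come out in x order and can be passed to 'lines' directly. The bins here ignore chromosome boundaries; grouping by `chr` as well would be a natural variation if the line should break between chromosomes.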

0 Answers