
I produced the following plot with ggplot2. As you can see, there are three classes, coloured red, black, and blue. I would like to draw two curves along the boundaries that separate the red points from the black points and the blue points from the black points. Any ideas? I am completely lost.

My code is:

datax=data.frame(x=y_data,y=x_data,
                 Diff_Motif_XY=factor(diff_motif,levels=c(1,0,-1)),
                 size=factor(abs(diff_motif))) 
#
p=ggplot(datax,aes(x,y))+ 
     geom_point(aes(colour = Diff_Motif_XY,size=size))+ 
     xlab(cond2)+ 
     ylab(cond1)+ 
     scale_colour_manual(values=c("red","black","blue"))

Please click for the image

tonytonov
John1402
  • Can you post a code sample? It sounds like you have a machine learning problem. – David Maust Dec 17 '15 at 23:21
  • It kind of depends on what you want your answer (your curve) to look like as a data structure. One way would be to construct a classifier (using an SVM or something similar) and then plot the 50% isobars. That would be fairly straightforward, I think. I can think of a couple of simpler possibilities too, but they are less general. – Mike Wise Dec 17 '15 at 23:26
  • The curve I am looking for should not be linear, and I would prefer an easy approach if there is one. My code is: datax=data.frame(x=y_data,y=x_data,Diff_Motif_XY=factor(diff_motif,levels=c(1,0,-1)),size=factor(abs(diff_motif))) p=ggplot(datax,aes(x,y))+ geom_point(aes(colour = Diff_Motif_XY,size=size))+ xlab(cond2)+ ylab(cond1)+ scale_colour_manual(values=c("red","black","blue")) – John1402 Dec 17 '15 at 23:31
  • Please add that to your question. You can edit questions, you know. – Mike Wise Dec 17 '15 at 23:39
  • Please provide a [reproducible example](http://stackoverflow.com/a/5963610/1412059). – Roland Dec 18 '15 at 08:06
  • A somewhat similar task is discussed [here](http://stackoverflow.com/questions/31893423/r-plotting-posterior-classification-probabilities-of-a-linear-discriminant-anal). – tonytonov Dec 18 '15 at 08:53
  • I don't think the question is very specific. To start with, I can only see red and black on the plot. The problem, as I see it, comes down to defining the boundary of change compared to expectation. – Kharoof Dec 19 '15 at 00:56
  • Thanks, everyone. I found the solution: draw the convex hull of the red points and of the blue points. This discussion is useful for that purpose: http://stats.stackexchange.com/questions/22805/how-to-draw-neat-polygons-around-scatterplot-regions-in-ggplot2 – John1402 Dec 21 '15 at 16:49
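
The convex hull approach from the last comment can be sketched roughly as follows. This is only an illustration under assumptions: the question's data was never posted, so `toy`, `cls`, `hull_of`, and `p_hull` are hypothetical names, and the random data merely stands in for `x_data`, `y_data`, and `diff_motif`. It uses `chull()` from grDevices (part of base R) to find the hull vertices and `geom_polygon()` to draw them.

```r
library(ggplot2)

# Hypothetical stand-in data (the original x_data, y_data, diff_motif were not posted)
set.seed(1)
n <- 500
toy <- data.frame(x = rnorm(n), y = rnorm(n))
toy$cls <- factor(ifelse(toy$x - toy$y > 1, 1,
                         ifelse(toy$y - toy$x > 1, -1, 0)),
                  levels = c(1, 0, -1))

# chull() returns the row indices of the hull vertices in clockwise order
hull_of <- function(d) d[chull(d$x, d$y), ]

# Keep only the two coloured classes and compute one hull per class
coloured <- droplevels(toy[toy$cls != "0", ])
hulls <- do.call(rbind, lapply(split(coloured, coloured$cls), hull_of))

p_hull <- ggplot(toy, aes(x, y)) +
  geom_point(aes(colour = cls)) +
  geom_polygon(data = hulls, aes(group = cls, colour = cls), fill = NA) +
  scale_colour_manual(values = c("1" = "red", "0" = "black", "-1" = "blue"))

print(p_hull)
```

A convex hull encloses every point of its class, so a few outliers can push the outline well into the black region; that is the trade-off against a fitted curve.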

1 Answer


I got (way too) curious. I think it looks like the boundary is a hyperbola. One could calculate the optimal bounding hyperbola using something like optim, but it would be a fair amount of work and it might not converge.

library(ggplot2)

# Generate some data because the OP did not provide any

npts <- 30000
l_data <- pmax(0,runif(npts,-10,20))
s_data <- (20-l_data + 10)/6
xstar <- -5.1
ystar <- -5.1
x_data <- pmax(0,l_data + rnorm(npts,0,s_data)) + xstar
y_data <- pmax(0,l_data + rnorm(npts,0,s_data)) + ystar
ha <- 6.0
hb <- 6.0

xy2 <- ((x_data-xstar)/ha)^2 - ((y_data-ystar)/hb)^2 + 0.8*rnorm(npts)


# 1 beyond the east-west hyperbola, -1 beyond the north-south one, 0 in between
diff_motif <- ifelse(xy2 > 1, 1, ifelse(xy2 > -1, 0, -1))
cond1 <- ""
cond2 <- ""


# We need this to plot our hyperbola
genhyperbola <- function( cx,cy,a,b,u0,u1,nu,swap=FALSE)
{
  # Generate a hyperbola through the parametric representation
  #  which uses sinh and cosh 
  #  We generate nu points from u0 to u1
  #  swap just swaps the x and y axes allowing for a north-south hyperbola (swap=TRUE)
  #
  #  https://en.wikipedia.org/wiki/Hyperbola
  #
  u <- seq(u0,u1,length.out=nu)
  x <- a*cosh(u)
  y <- b*sinh(u)
  df <- data.frame(x=x,y=y)
  df$x <- df$x + cx
  df$y <- df$y + cy
  if (swap){
    # for north-south hyperbolas
    tmp <- df$x
    df$x <- df$y
    df$y <- tmp
  }
  return(df)
}
hyp1 <- genhyperbola(xstar,ystar, ha,hb, 0,2.2,100, swap=FALSE)
hyp2 <- genhyperbola(xstar,ystar, ha,hb, 0,2.2,100, swap=TRUE)

datax=data.frame(x=x_data,y=y_data,
                 Diff_Motif_XY=factor(diff_motif,levels=c(1,0,-1)),
                 size=0) 

# The curve is ((x-xstar)/ha)^2 - ((y-ystar)/hb)^2 = 1, so the label must show x + (-xstar)
eqlab1 <- sprintf("((x+%.1f)/%.1f)^{2}-((y+%.1f)/%.1f)^{2} == 1",-xstar,ha,-ystar,hb)
eqlab2 <- sprintf("((y+%.1f)/%.1f)^{2}-((x+%.1f)/%.1f)^{2} == 1",-ystar,hb,-xstar,ha)
#
p=ggplot(datax,aes(x,y))+ 
  geom_point(aes(colour = Diff_Motif_XY),shape=".")+ 
  geom_path(data=hyp1,aes(x,y),color=I("purple"),size=1)+
  geom_path(data=hyp2,aes(x,y),color=I("brown"),size=1)+
  xlab(cond2)+ 
  ylab(cond1)+ 
  scale_colour_manual(values=c("blue","black","red")) +
  annotate('text', x=xstar+20, y=ystar+2,  
           label = eqlab1,parse = TRUE,size=6,color="purple") +
  annotate('text', x=xstar+5,  y=ystar+20, 
           label = eqlab2,parse = TRUE,size=6,color="brown") 

print(p)

And here is the image:

[image: the plot produced by the code above]
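
The `optim` idea mentioned at the top of this answer could look roughly like the sketch below. It is not what was used for the plot above: the logistic loss, the log-scale parameterisation, and the names `hyp_value`, `hyp_loss`, `start`, and `est` are all assumptions made for illustration, and the data is a simplified, uniformly sampled version of the generator above. Only the east-west boundary is fitted, and convergence is not guaranteed.

```r
# Simplified synthetic data with a noisy hyperbolic boundary (assumed, as in the answer)
set.seed(42)
npts <- 2000
xstar <- -5.1; ystar <- -5.1; ha <- 6; hb <- 6
x <- runif(npts, xstar, xstar + 25)
y <- runif(npts, ystar, ystar + 25)
xy2 <- ((x - xstar) / ha)^2 - ((y - ystar) / hb)^2 + 0.8 * rnorm(npts)
is_east <- xy2 > 1   # TRUE for the class beyond the east-west hyperbola

# Implicit form of the hyperbola: zero on the curve, positive beyond it
# par = (cx, cy, log(a), log(b), log(k)); logs keep a, b and k positive
hyp_value <- function(par, x, y) {
  a <- exp(par[3])
  b <- exp(par[4])
  ((x - par[1]) / a)^2 - ((y - par[2]) / b)^2 - 1
}

# Mean logistic loss: small when the sign of hyp_value agrees with the labels
hyp_loss <- function(par, x, y, is_east) {
  z <- -exp(par[5]) * ifelse(is_east, 1, -1) * hyp_value(par, x, y)
  val <- mean(pmax(z, 0) + log1p(exp(-abs(z))))  # numerically stable log(1 + exp(z))
  if (is.finite(val)) val else 1e10
}

# Rough starting guess, then Nelder-Mead (optim's default method)
start <- c(min(x), min(y), log(5), log(5), 0)
fit <- optim(start, hyp_loss, x = x, y = y, is_east = is_east,
             control = list(maxit = 5000))

est <- c(cx = fit$par[1], cy = fit$par[2],
         a = exp(fit$par[3]), b = exp(fit$par[4]))
print(round(est, 2))
print(fit$convergence)  # 0 means optim reported convergence
```

The estimates in `est` can be passed to `genhyperbola()` in place of `xstar`, `ystar`, `ha`, and `hb`. A different starting point or loss may be needed for real data.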

Mike Wise
  • Sorry, there was a typo in there before. – Mike Wise Dec 21 '15 at 18:46
  • Hi Mike, thanks for your input; this is also a very interesting idea. I developed a similar local convex hull idea, but I like this one very much too. Thanks for your effort. – John1402 Dec 25 '15 at 17:40