I want to add a third data set to a scatterplot but it has values orders of magnitude larger than the other two data sets.

Is there a way to plot values from all three data sets in one window, signifying data set 3's y values on the right y-axis (axis 4) in a color matching the color I use for set 3's points? Here is what I've tried:

plot(xlab = "Mb", ylab = "Pi", 
      x, yAll,          pch = 6,  cex = .5, col = "blue", type ="b" )
lines(x, yAll_filtered, pch = 18, cex = .5, col = "red",  type = "b")

This gets me two of the three data sets plotted, then I don't know the next step.

Ideally I can plot set 3 values in green and have the differently scaled y values appear on the right, also in green. Basically, I want to plot these y values with these parameters:

plot(x, yAll_normalized, pch = 19, cex = .5, col = "green", type = "b", 
     axis(4))
gung - Reinstate Monica
martin gordy
  • It would really help if you created a small reproducible example. Try creating a small dataset with y values of different magnitudes. Then use `dput` on the data, so we can just cut and paste it into our consoles. – nograpes Sep 18 '13 at 13:29
  • Duplicate? See e.g. http://stackoverflow.com/questions/3099219/how-to-use-ggplot2-make-plot-with-2-y-axes-one-y-axis-on-the-left-and-another – Henrik Sep 18 '13 at 15:02
  • That's a good catch, @Henrik, but I suspect ggplot2 is not what the OP is after; it's rather advanced & requires a different way of thinking, whereas I suspect the OP is new to R. – gung - Reinstate Monica Sep 18 '13 at 15:11

2 Answers

Three simple options:

1. pracma::plotyy
2. graphics::axis
3. Just scale your dataset before plotting.

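If you go with number two, first plot your "left-hand" data the usual way, then call axis(side=4, {other setup arguments}), followed by lines(data3, ...). Note that axis() only draws the tick marks and labels; it does not rescale anything, so data3 must already be mapped onto the left axis's range (or be drawn in an overlaid plot) for its points to land inside the plotting region.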
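
One way option two can be made to work is to overlay a second plot with par(new=TRUE) and let that plot define the right-hand scale. This is only a minimal sketch with made-up data: x, y_left and y_right are hypothetical names, not the asker's variables.

```r
# Hypothetical example data; the names are illustrative only
x       <- 1:10
y_left  <- c(20, 21, 23, 22, 25, 27, 26, 28, 30, 31)
y_right <- y_left * 100 + 500          # much larger values

op <- par(mar = c(5, 4, 4, 4) + 0.1)   # widen the right margin for the second axis
plot(x, y_left, type = "b", pch = 6, cex = .5, col = "blue",
     xlab = "Mb", ylab = "Pi")

par(new = TRUE)                        # draw the next plot over the current one
plot(x, y_right, type = "b", pch = 19, cex = .5, col = "green",
     axes = FALSE, xlab = "", ylab = "")
right_range <- par("usr")[3:4]         # y limits of the overlaid coordinate system
axis(side = 4, col = "green", col.axis = "green")  # green ticks and green labels
par(op)                                # restore the original margins
```

Any further left-axis series should be added with lines() before the par(new=TRUE) call, because afterwards the coordinate system belongs to the right-hand data.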

Edit - per gung's valid comment, here's part of the help file for plotyy:

plotyy(x1, y1, x2, y2, gridp = TRUE, box.col = "grey",
                       type = "l", lwd = 1, lty = 1,
                       xlab = "x", ylab = "y", main = "",
                       col.y1 = "navy", col.y2 = "maroon", ...)
Arguments

x1, x2  
x-coordinates for the curves

y1, y2  
the y-values, with ordinates y1 left, y2 right.

type    
type of the curves, line or points (for both data).

That will plot the x1,y1 with autoscaling on the left-y axis, and x2,y2 similarly on the right-y axis.
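
A minimal call based on the signature quoted above might look like the following. The data are made up, the names y_small and y_large are hypothetical, and the sketch assumes the pracma package is installed; it only uses arguments that appear in the help excerpt.

```r
# Hypothetical example data on two very different scales
x       <- 1:10
y_small <- c(20, 21, 23, 22, 25, 27, 26, 28, 30, 31)
y_large <- c(300, 290, 275, 280, 260, 240, 245, 230, 210, 200)

# Assumes pracma is installed: install.packages("pracma")
if (requireNamespace("pracma", quietly = TRUE)) {
  # y_small is scaled on the left axis, y_large on the right axis
  pracma::plotyy(x, y_small, x, y_large,
                 xlab = "Mb", ylab = "Pi",
                 col.y1 = "blue", col.y2 = "green")
}
```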

Carl Witthoft

(This question may be better migrated to stats.SE, because the issue isn't what function to call but understanding the ideas behind these things.)

The basic strategy here is to scale your dataset before plotting, as @Carl Witthoft notes. Here's how it works (to understand any of the functions used, enter ?<function name> at the prompt on your R console):

# here I generate some example data, set.seed makes it reproducible
set.seed(33)
x <- 1:20; y0 <- 20; y1 <- 25; y2 <- 300
for(i in 2:20){
  y0 <- c(y0, y0[i-1]+rnorm(1, mean=0.25, sd=1.5))
  y1 <- c(y1, y1[i-1]+rnorm(1, mean=0,    sd=1))
  y2 <- c(y2, y2[i-1]+rnorm(1, mean=-10,  sd=5))
}
max(y0, y1)  
# [1] 35.3668
min(y0, y1)
# [1] 17.77653
# from 0 to 50 seems like a reasonable Y range for the plotting area

dev.new()   # opens a new graphics device; windows() would work only on Windows
  plot (x, y0, pch=6,  cex=.5, col="blue", type="b", 
        xlab="Mb", ylab="Pi", ylim=c(0, 50))
  lines(x, y1, pch=18, cex=.5, col="red",  type="b")

# We need to create a new variable that will fit within this plotting area
y2new <- scale(y2)        # this makes y2 have mean 0 & sd 1
y2new <- y2new*sd(y0)     # now its sd will equal that of y0
y2new <- y2new+mean(y0)   # now its mean will also equal that of y0

  lines(x, y2new, pch=24, cex=.5, col="green", type="b")

# now y2 fits within the window, but we need an axis which must map the 
#   plotted points to the original values

summary(y0)
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#   17.78   20.64   24.34   25.62   30.25   35.37
summary(y2)
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#   125.1   178.2   222.2   220.0   266.3   300.0
sd(y0)
# [1] 5.627629
sd(y2)
#[1] 54.76167

# thus, we need an axis w/ 25.62 showing 220 instead, & where 5.63 higher
#   shows 54.76 higher instead

increments <- (mean(y0)-seq(from=0, to=50, by=10))/sd(y0)
increments
# [1]  4.5521432  2.7751960  0.9982488 -0.7786983 -2.5556455
# [6] -4.3325927
newTicks   <- mean(y2) - increments*sd(y2)
newTicks
# [1] -29.24281  68.06579 165.37438 262.68298 359.99158
# [6] 457.30017

# the bottom of the y axis in the plot is 4.55 sd's below y0's mean, 
#   thus the bottom of the new axis should be about -30, and the top of 
#   the new axis should be about 460

  axis(side=4, at=seq(0, 50, 10), labels=round(newTicks), col="green",
       col.axis="green")   # col colors the axis line & ticks, col.axis the labels
  legend("bottomleft", c("y0 (left axis)", "y1 (left axis)", 
         "y2 (right axis)"), pch=c(6, 18, 24), lty=1, 
         col=c("blue", "red", "green"))

All of this is a bit of a pain. From @Carl Witthoft's answer, I gather the function plotyy() will do this for you automatically (I've never used it), but you will have to install (and subsequently load) the pracma package first.
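
If you would rather stay in base R, the manual steps above can be wrapped in two small helpers. This is just a sketch of the same linear mapping; rescale_to() and ticks_for() are hypothetical names I made up, not functions from any package, and the data are toy values.

```r
# Map y linearly so that its mean and sd match those of ref
rescale_to <- function(y, ref) {
  (y - mean(y)) / sd(y) * sd(ref) + mean(ref)
}

# Inverse mapping: convert positions on ref's (left) scale back to y's units
ticks_for <- function(at, y, ref) {
  (at - mean(ref)) / sd(ref) * sd(y) + mean(y)
}

# Toy data standing in for y0 (left axis) and y2 (right axis)
y0 <- c(20, 22, 25, 24, 28, 31)
y2 <- c(300, 280, 250, 260, 210, 180)

y2new     <- rescale_to(y2, y0)          # values to pass to lines()
at        <- seq(0, 50, by = 10)         # tick positions on the left scale
newLabels <- round(ticks_for(at, y2, y0))  # labels for the right-hand axis
```

With these, the last steps become lines(x, y2new, ...) followed by axis(side=4, at=at, labels=newLabels, col="green", col.axis="green").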

gung - Reinstate Monica