I am trying to write a function that builds a plot with ggplot, but I am running into an issue with aes() and the parameters I pass to geom_point(), specifically color and shape.

Here is a very simplified version of the code where things start to break. Some background on what the code is trying to achieve might be helpful. Assume we have a data frame x with variables Dose and Response, and a function OutlierDetect(Dose, Response) that returns the indexes of the rows of x that appear to be outliers relative to some fitted curve. The goal is to plot these rows with a different shape from the rest of the data.

a <- ggplot(NULL, aes(x = Dose, y = Response))
shape.vec <- rep(19, nrow(x))
out.index <- OutlierDetect(Dose=x$Dose,Response=x$Response)
shape.vec[out.index] <- 17
a <- a + geom_point(data = x, size = 5, alpha = 0.8, aes(color = "group1"), shape = shape.vec)

I would like to avoid the seemingly obvious answer of adding a factor column to x, because I'd rather not modify x if possible. The code is written this way so that it is flexible enough to add a new data set y, with both plotted on the same graph and ggplot generating the color legend automatically.

I am receiving this error: Error: Aesthetics must be either length 1 or the same as the data (1): shape, size, alpha. From my own research, it seems like a bug in ggplot when building the legend. For some reason color and shape need to be the same length to work properly (see the code chunks below).

For reference, this code chunk works fine:

a <- ggplot(NULL, aes(x = Dose, y = Response))
shape.vec <- rep(19, nrow(x))
color.vec <- rep("blue", nrow(x)) # need to manually specify new colors/color palette, which means more user input
out.index <- OutlierDetect(Dose=x$Dose,Response=x$Response)
shape.vec[out.index] <- 17
a <- a + geom_point(data = x, size = 5, alpha = 0.8, color = color.vec, shape = shape.vec)

As does this chunk:

a <- ggplot(NULL, aes(x = Dose, y = Response))
a <- a + geom_point(data = x, size = 5, alpha = 0.8, aes(color = "group1"), shape = 19)
# can add more data frames y, z, etc., so long
# as the aes color parameter has a different name

So in summary, what I would like is a mixture of these two code blocks. Any suggestions?

Reproducible example code

library(ggplot2)

x <- data.frame(Dose = c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10),
                Response = seq(20))
a <- ggplot(NULL, aes(x = Dose, y = Response))
shape.vec <- rep(19, nrow(x))
out.index <- nrow(x) # let's say the OutlierDetect function always says the last index is an outlier
shape.vec[out.index] <- 17
a <- a + geom_point(data = x, size = 5, alpha = 0.8, aes(color = "group1"), shape = shape.vec)
Justin Landis
  • When asking for help, you should include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. But `ggplot` works best when all the things you want to use in your `aes()` come from the data source. You don't have to change `x`, you just need to pass a version to `ggplot` that has all the columns needed for the plot. – MrFlick May 23 '18 at 14:57
  • Could you use `dput()` to show your data? It's hard to answer without seeing what your data looks like. See [How to make a great R reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Jan Boyer May 23 '18 at 15:10
  • Hello MrFlick, to make things simple, `Dose <- c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10)` and `Response <- seq(20)`, and let's say the last index is considered an outlier, i.e. `shape.vec[20] <- 17`. (The actual numbers do not matter; the error comes at the plot.) But you are correct. I can save a copy of x and modify that one, adding a color variable and a shape variable. That should work just as well. Still, I thought my original error was a bit unintuitive. – Justin Landis May 23 '18 at 15:29
  • @JustinLandis all of that code should be edited into your question, not placed in a comment – camille May 23 '18 at 15:43
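
One way to read MrFlick's suggestion, as a minimal sketch rather than a definitive fix: build a plotting copy of `x` that carries the shape codes as a column, map that column inside `aes()`, and add `scale_shape_identity()` so the numeric codes are used as they are. The names `x.plot` and the `shape` column are made up for illustration, and `out.index` stands in for the result of `OutlierDetect()`.

```r
library(ggplot2)

x <- data.frame(Dose = c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10),
                Response = seq(20))
out.index <- nrow(x)  # assumption: stand-in for OutlierDetect(), flags the last row

# Plotting copy (hypothetical name); x itself is left untouched.
x.plot <- x
x.plot$shape <- 19
x.plot$shape[out.index] <- 17

a <- ggplot(NULL, aes(x = Dose, y = Response)) +
  geom_point(data = x.plot, size = 5, alpha = 0.8,
             aes(color = "group1", shape = shape)) +
  scale_shape_identity()  # use the 19/17 codes directly as point shapes
```

Because the shape values now come from the layer's data instead of a length-20 parameter, the color legend should no longer have to reconcile a vector-valued `shape` with a single legend key.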

1 Answer

Your code has nrow(x$Dose), which doesn't work because you are asking for the number of rows of a 1D vector. You can change it to either nrow(x) or length(x$Dose).

a <- ggplot(NULL, aes(x = Dose, y = Response))
shape.vec <- rep(19, nrow(x))
out.index <- nrow(x) # let's say the OutlierDetect function always says the last index is an outlier
shape.vec[out.index] <- 17
a <- a + geom_point(data = x, size = 5, alpha = 0.8, aes(color = "group1"), shape = shape.vec)

[Image: plot produced by the code above]
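
The `nrow()` point above can be checked directly. This is only a small base-R illustration using the data from the question; `NROW()` is included as a further alternative that the answer does not mention.

```r
x <- data.frame(Dose = c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10),
                Response = seq(20))

nrow(x$Dose)    # NULL: a plain vector has no dim attribute
nrow(x)         # 20
length(x$Dose)  # 20
NROW(x$Dose)    # 20: NROW() treats a vector as a one-column matrix
```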

GordonShumway
  • It is interesting that your code was able to produce this graph, because even when I run your corrected version I still get the same error message. Maybe it is the version of R or ggplot2 that I have; I currently have R version 3.4.4 and ggplot2 version 2.2.1. What version of ggplot2 are you running? – Justin Landis May 23 '18 at 16:28
  • I'm running R 3.3.2 and ggplot2 2.2.1.9000 – GordonShumway May 24 '18 at 19:03
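
Since the behaviour seems to differ between the two ggplot2 versions mentioned in these comments, a route that avoids a vector-valued `shape` parameter altogether may be more portable. The following is a sketch under stated assumptions, not the asker's or the answerer's method: `add_group()` is a hypothetical helper, `y` is a made-up second data set, and the `group` and `shape` columns exist only in the helper's local copy of each data frame.

```r
library(ggplot2)

# Hypothetical helper: adds one data frame as a colored group and draws the
# rows listed in out.index as triangles (17) instead of circles (19).
# R copies `data` on modification, so the caller's data frame is unchanged.
add_group <- function(plot, data, label, out.index = integer(0)) {
  data$group <- label
  data$shape <- 19
  data$shape[out.index] <- 17
  plot + geom_point(data = data, size = 5, alpha = 0.8,
                    aes(color = group, shape = shape))
}

x <- data.frame(Dose = c(1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9,10,10),
                Response = seq(20))
y <- data.frame(Dose = 1:10, Response = 10:1)  # made-up second data set

a <- ggplot(NULL, aes(x = Dose, y = Response)) + scale_shape_identity()
a <- add_group(a, x, "group1", out.index = nrow(x))  # last row flagged, as in the question
a <- add_group(a, y, "group2")                       # no outliers flagged
```

Each call contributes its label to the shared color scale, so the color legend is still generated automatically, which is the flexibility the question asks for.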