
Is it possible to create a legend for a graph that plots multiple series using ggplot2 in R? Essentially, this is what I'm doing:

library(ggplot2)

x <- c(1,2,3,4)
y <- c(1.1,1.2,1.3,1.4)
y2 <- c(2.1,2.2,2.3,2.4)
x3 <- c(4,5,6,7)
y3 <- c(3.1,3.2,3.3,3.2)
p1 <- data.frame(x=x,y=y)
p2 <- data.frame(x=x,y=y2)
p3 <- data.frame(x=x3,y=y3)

ggplot(p1, aes(x,y)) + geom_point(color="blue") + geom_point(data=p2, color="red") + geom_point(data=p3,color="yellow") 

The command above plots all three data sets, p1, p2, and p3, in three different colors. I know I haven't yet specified the names of each data set, but how would I go about creating a legend that identifies them? In other words, I just want a legend that says that all blue points are P1, all red points are P2, and all yellow points are P3.

Chase
Miguel

1 Answer


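You need to combine them into a single data.frame and map the colour aesthetic to the dataset each point comes from. You can use melt from the reshape package (reshape2 also provides melt) to build the single data.frame. When given a named list, melt stores each element's name in a column called L1, which is the column that colour is mapped to: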

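library(ggplot2)
library(reshape)  # reshape2 also provides melt()

# Stack p1, p2 and p3; the list names end up in column L1, the y values in column value
zz <- melt(list(p1 = p1, p2 = p2, p3 = p3), id.vars = "x")

ggplot(zz, aes(x, value, colour = L1)) + geom_point() +
    scale_colour_manual("Dataset", values = c("p1" = "blue", "p2" = "red", "p3" = "yellow"))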

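If you would rather not depend on reshape, the same "single data.frame plus a colour mapping" idea can be sketched with base R. This is only an illustration under a few assumptions: the column name dataset and the object names combined and plot_combined are made up for this example and are not part of the answer above.

```r
library(ggplot2)

# Same example data as in the question
p1 <- data.frame(x = c(1, 2, 3, 4), y = c(1.1, 1.2, 1.3, 1.4))
p2 <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 2.2, 2.3, 2.4))
p3 <- data.frame(x = c(4, 5, 6, 7), y = c(3.1, 3.2, 3.3, 3.2))

# Stack the three data frames and record which one each row came from.
# "dataset" is a hypothetical column name chosen for this sketch.
combined <- rbind(
  cbind(p1, dataset = "p1"),
  cbind(p2, dataset = "p2"),
  cbind(p3, dataset = "p3")
)

# Mapping colour to the dataset column is what makes ggplot2 draw a legend.
plot_combined <- ggplot(combined, aes(x, y, colour = dataset)) +
  geom_point() +
  scale_colour_manual("Dataset",
                      values = c(p1 = "blue", p2 = "red", p3 = "yellow"))
print(plot_combined)
```

Here the legend entries come from the values in the dataset column, and scale_colour_manual only decides which colour each value gets.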

Chase
  • can you do this without melting the data together?? – theforestecologist Jul 12 '17 at 01:18
  • @theforestecologist - Probably, but not easily. You could create different geom_point() layers for each dataset and assign colours manually, but you'd end up building everything from scratch and losing out on the thoughtful defaults that ggplot2 puts in place for scales / axes, legends, etc. Maybe someone who loves an arbitrary ggplot2 challenge would take that up, but much easier to go with the melt route IMO. – Chase Jul 14 '17 at 20:36
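
For reference, the layer-per-dataset approach mentioned in the comment above might look roughly like the sketch below. It is not code from the answerer. It relies on mapping colour to a constant label inside aes() in each layer, so that ggplot2 still builds a legend; the labels "p1", "p2", "p3" and the name plot_layers are assumptions made for this example.

```r
library(ggplot2)

# Same example data as in the question
p1 <- data.frame(x = c(1, 2, 3, 4), y = c(1.1, 1.2, 1.3, 1.4))
p2 <- data.frame(x = c(1, 2, 3, 4), y = c(2.1, 2.2, 2.3, 2.4))
p3 <- data.frame(x = c(4, 5, 6, 7), y = c(3.1, 3.2, 3.3, 3.2))

# One geom_point() layer per data frame. Putting a constant string inside
# aes() maps colour to a label instead of setting a fixed colour, so the
# label shows up in the legend.
plot_layers <- ggplot(p1, aes(x, y)) +
  geom_point(aes(colour = "p1")) +
  geom_point(data = p2, aes(colour = "p2")) +
  geom_point(data = p3, aes(colour = "p3")) +
  scale_colour_manual("Dataset",
                      values = c(p1 = "blue", p2 = "red", p3 = "yellow"))
print(plot_layers)
```

The trade-off is that every new dataset needs its own layer and its own entry in values, whereas the single data.frame route handles any number of datasets with one layer.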