
I'm trying to add a legend to a plot that I've created using ggplot. I load the data in from two csv files, each of which has two columns of 8 rows (not including the header).

I construct a data frame from each file that includes a cumulative total, so each data frame has three columns (bv, bin_count and bin_cumulative) of 8 rows, and every value is an integer.

The two data sets are then plotted as follows. The display is fine, but I can't figure out how to add a legend to the resulting plot. It seems the ggplot object itself should have a data source, but I'm not sure how to build one when both data frames have columns with the same names.

library(ggplot2)

i2d <- data.frame(bv=c(0,1,2,3,4,5,6,7), bin_count=c(0,0,0,2,1,2,2,3), bin_cumulative=cumsum(c(0,0,0,2,1,2,2,3)))
i1d <- data.frame(bv=c(0,1,2,3,4,5,6,7), bin_count=c(0,1,1,2,3,2,0,1), bin_cumulative=cumsum(c(0,1,1,2,3,2,0,1)))


c_data_plot <- ggplot() + 
  geom_line(data = i1d, aes(x=i1d$bv,  y=i1d$bin_cumulative), size=2, color="turquoise") +
  geom_point(data = i1d, aes(x=i1d$bv, y=i1d$bin_cumulative), color="royalblue1", size=3) + 
  geom_line(data = i2d, aes(x=i2d$bv,  y=i2d$bin_cumulative), size=2, color="tan1") +  
  geom_point(data = i2d, aes(x=i2d$bv, y=i2d$bin_cumulative), color="royalblue3", size=3) +
  scale_x_continuous(name="Brightness", breaks=seq(0,8,1)) +
  scale_y_continuous(name="Count", breaks=seq(0,12,1)) + 
  ggtitle("Combine plot of BV cumulative counts")

c_data_plot

I'm fairly new to R and would much appreciate any help.

Per comments, I've edited the code to reproduce the dataset after it's loaded into the dataframes.

Regarding producing a single data frame, I'd welcome advice on how to achieve that. I'm still struggling with how data frames work.

Sanger99
  • please provide a reproducible example. – Cyrus Mohammadian Aug 29 '16 at 10:23
  • also at a first glance, consider including your ``size`` and/or ``color`` options inside the aesthetics call. – Cyrus Mohammadian Aug 29 '16 at 10:25
  • @Cyrus Mohammadian Exactly! You're "abusing" ggplot at the moment because you separate too much. Try to make one data frame with all the information in it and then add most of the information in the ggplot() statement, as a start. – Huub Hoofs Aug 29 '16 at 10:27
  • Does this answer your question? [Add legend to ggplot2 line plot](https://stackoverflow.com/questions/10349206/add-legend-to-ggplot2-line-plot) – stefan Mar 10 '21 at 21:19

1 Answer


First, we organize the data by combining i1d and i2d into a single data frame. I've added a column, data, that stores the name of the original dataset for each row.

restructure data

i1d$data <- 'i1d'
i2d$data <- 'i2d'
i12d <- rbind.data.frame(i1d, i2d)
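If it helps to see what the combined data frame looks like, here is a minimal sketch that rebuilds the question's two data frames so it runs on its own, stacks them, and inspects the result. I use plain rbind() here, which dispatches to the same data frame method as rbind.data.frame().

```r
# Rebuild the two data frames from the question so this runs on its own.
i1d <- data.frame(bv = 0:7, bin_count = c(0, 1, 1, 2, 3, 2, 0, 1))
i1d$bin_cumulative <- cumsum(i1d$bin_count)
i2d <- data.frame(bv = 0:7, bin_count = c(0, 0, 0, 2, 1, 2, 2, 3))
i2d$bin_cumulative <- cumsum(i2d$bin_count)

# Tag every row with the name of the data frame it came from.
i1d$data <- "i1d"
i2d$data <- "i2d"

# Stack the rows; this works because both frames have identical column names.
i12d <- rbind(i1d, i2d)

str(i12d)         # 16 rows, 4 columns
table(i12d$data)  # 8 rows from each source
```

The identical column names that looked like a problem in the question are what make this work: rbind() matches the columns by name, and the new data column records which rows belong to which file.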

Then, we create the plot, using syntax that is more common to ggplot2:

create plot

ggplot(i12d, aes(x = bv, y = bin_cumulative))+
    geom_line(aes(colour = data), size = 2)+
    geom_point(colour = 'royalblue', size = 3)+
    scale_x_continuous(name="Brightness", breaks=seq(0,8,1)) +
    scale_y_continuous(name="Count", breaks=seq(0,12,1)) + 
    ggtitle("Combine plot of BV cumulative counts")+
    theme_bw()

If we specify x and y within the ggplot function, we do not need to keep rewriting them in the various geoms we want to add to the plot. After the first three lines, I copied and pasted what you had so that the formatting would match your expectation. I also added theme_bw, because I think it's more visually appealing. We also specify colour inside aes using a variable (data) from our data.frame. Mapping colour to a variable this way is what makes ggplot2 draw the legend; a colour set outside aes, as for the points, is a fixed value and gets no legend entry.
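To make the difference between mapping and setting a colour concrete, here is a small sketch, assuming the same combined data as above (rebuilt inline so it runs on its own). The names p_mapped and p_fixed are just for illustration.

```r
library(ggplot2)

# Same values as the combined data frame, rebuilt inline.
i12d <- data.frame(
  bv = rep(0:7, times = 2),
  bin_cumulative = c(cumsum(c(0, 1, 1, 2, 3, 2, 0, 1)),
                     cumsum(c(0, 0, 0, 2, 1, 2, 2, 3))),
  data = rep(c("i1d", "i2d"), each = 8)
)

# Mapped: colour depends on a column, so ggplot2 builds a legend for it.
p_mapped <- ggplot(i12d, aes(x = bv, y = bin_cumulative)) +
  geom_line(aes(colour = data))

# Set: a constant colour outside aes(), so no legend is drawn.
# group = data still keeps the two series as separate lines.
p_fixed <- ggplot(i12d, aes(x = bv, y = bin_cumulative, group = data)) +
  geom_line(colour = "tan1")
```

Printing p_mapped should show a legend keyed on data, while p_fixed should show two lines of the same colour and no legend. Inside aes(), bare names such as bv are looked up as columns of the data frame given to ggplot(), which is why writing i1d$bv there is unnecessary.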


If we want to take this a step further, we can use the scale_colour_manual function to specify the colors attributed to the different values of the data column in the data.frame i12d:

ggplot(i12d, aes(x = bv, y = bin_cumulative))+
    geom_line(aes(colour = data), size = 2)+
    geom_point(colour = 'royalblue', size = 3)+
    scale_x_continuous(name="Brightness", breaks=seq(0,8,1)) +
    scale_y_continuous(name="Count", breaks=seq(0,12,1)) + 
    ggtitle("Combine plot of BV cumulative counts")+
    theme_bw()+
    scale_colour_manual(values = c('i1d' = 'turquoise',
                                   'i2d' = 'tan1'))
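The same scale can also carry the legend title and entry labels. This is only a sketch: the title "Dataset" and the labels "First file" and "Second file" are placeholders I made up, so swap in whatever describes your two csv files.

```r
library(ggplot2)

# Same values as the combined data frame, rebuilt inline.
i12d <- data.frame(
  bv = rep(0:7, times = 2),
  bin_cumulative = c(cumsum(c(0, 1, 1, 2, 3, 2, 0, 1)),
                     cumsum(c(0, 0, 0, 2, 1, 2, 2, 3))),
  data = rep(c("i1d", "i2d"), each = 8)
)

# name sets the legend title; breaks and labels are paired by position.
# The title and label text below are placeholders, not from the question.
legend_scale <- scale_colour_manual(
  name   = "Dataset",
  values = c("i1d" = "turquoise", "i2d" = "tan1"),
  breaks = c("i1d", "i2d"),
  labels = c("First file", "Second file")
)

p_legend <- ggplot(i12d, aes(x = bv, y = bin_cumulative)) +
  geom_line(aes(colour = data)) +
  legend_scale
```

Swapping legend_scale in for the scale_colour_manual call in the plot above should leave the line colours unchanged and alter only the text shown in the legend.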


bouncyball
  • Thanks much for your help - that's pretty much perfect. In the ggplot() aes call ( aes(x = bv, y = bin_cumulative) ), is x=bv and y=bin_cumulative instructing R to look for those columns in the data frame? – Sanger99 Aug 29 '16 at 13:32
  • @Sanger99 In a sense, yes. Check out [this cheatsheet](https://www.rstudio.com/wp-content/uploads/2015/03/ggplot2-cheatsheet.pdf) for more `ggplot2` information – bouncyball Aug 29 '16 at 13:35
  • Ok, I'll check that out - thanks again for your help! – Sanger99 Aug 29 '16 at 13:43