
I have a data frame df with the following structure:

    > dput(head(df))
    structure(list(Percent = c(1, 2, 3, 4, 5), Test = c(4, 2, 3, 
    5, 2), Train = c(10, 12, 10, 13, 15)), .Names = c("Percent", 
    "Test", "Train"), row.names = c(NA, 5L), class = "data.frame")

Printed, it looks like this:

    Percent    Test    Train
    1          4       10
    2          2       12
    3          3       10
    4          5       13
    5          2       15

How can I plot Test and Train as two lines with ggplot?

This is what I have right now:

    ggplot(df, aes(x = Percent, y = Test)) + geom_point() + 
      geom_line() 

I also want to add the Train values to the same plot as points connected by a line, in a different color from Test, with both series labeled in a legend. I am not sure how to do this.


Liondancer

1 Answer


There are two ways: either add a layer for each column, or restructure your data beforehand.

Adding layers:

    ggplot(df, aes(x = Percent)) + 
      geom_point(aes(y = Test), colour = "red") + 
      geom_line(aes(y = Test), colour = "red") + 
      geom_point(aes(y = Train), colour = "blue") + 
      geom_line(aes(y = Train), colour = "blue")
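Note that when colours are set outside `aes()`, ggplot2 draws no legend. One possible way to get a legend while still adding layers is to map colour to a constant label inside `aes()` and assign the actual colours with `scale_colour_manual()`. This is a minimal sketch using the question's sample data; the labels "Test" and "Train", the colour choices, and the axis and legend titles are illustrative assumptions.

```r
library(ggplot2)

# Sample data from the question
df <- data.frame(
  Percent = c(1, 2, 3, 4, 5),
  Test    = c(4, 2, 3, 5, 2),
  Train   = c(10, 12, 10, 13, 15)
)

# Mapping colour to a constant string inside aes() creates a legend entry
p <- ggplot(df, aes(x = Percent)) +
  geom_point(aes(y = Test, colour = "Test")) +
  geom_line(aes(y = Test, colour = "Test")) +
  geom_point(aes(y = Train, colour = "Train")) +
  geom_line(aes(y = Train, colour = "Train")) +
  # Assign the actual colours to the labels used above (assumed choices)
  scale_colour_manual(values = c(Test = "red", Train = "blue")) +
  # Assumed axis and legend titles
  labs(y = "value", colour = "type")

print(p)
```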

Restructure your data:

    # df2 <- tidyr::gather(df, key = type, value = value, -Percent) # Old way
    df2 <- tidyr::pivot_longer(df, -Percent, names_to = "type", values_to = "value") # New way

    ggplot(df2, aes(x = Percent, y = value, colour = type)) +
      geom_point() +
      geom_line()

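The second option is generally preferred because it plays to ggplot2's strengths: with the data in long format, a single colour mapping draws both series and generates the legend automatically.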
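For reference, here is a self-contained sketch of the restructuring approach using the question's sample data, which also prints the reshaped data frame so the long format is visible. The axis label and legend title passed to `labs()` are illustrative assumptions.

```r
library(ggplot2)
library(tidyr)

# Sample data from the question
df <- data.frame(
  Percent = c(1, 2, 3, 4, 5),
  Test    = c(4, 2, 3, 5, 2),
  Train   = c(10, 12, 10, 13, 15)
)

# Reshape from wide to long: one row per Percent/type pair
df2 <- pivot_longer(df, -Percent, names_to = "type", values_to = "value")

# 5 Percent values x 2 series = 10 rows, with columns Percent, type, value
print(df2)

# One colour mapping handles both series and the legend
p <- ggplot(df2, aes(x = Percent, y = value, colour = type)) +
  geom_point() +
  geom_line() +
  labs(y = "Value", colour = "Data set")  # assumed labels

print(p)
```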

Phil