
Edit: Added data.

As a former Stata user and coding newbie, I'm having fun and trouble with the transition to R. I'm trying to make a line plot based on some of the variables in the data frame below. I've got the lines going the way I want them to, but adding a simple legend where I get to label each linetype is not going well at all. I can't even get a legend to show up! I'm fairly certain I'm not getting the level of abstraction used by ggplot here, but I'm also completely certain I want to specify this stuff manually to get the format right across multiple graphs for the publication.

By all means tell me if my whole approach is wrong, but if there is a line or two I can add that clears this up instead of adding more levels of abstraction I'd be a very happy camper.

(I have tried many solutions from similar questions on Stack Overflow and elsewhere, but I'm just not getting it.)

library(ggplot2)

codingsuccess <- ggplot(data = ystats, aes(x = iyear)) +
  geom_line(aes(y = pnattacks), linetype = "dotted", size = 1) +
  geom_line(aes(y = ptra), linetype = "longdash", size = 1) +
  geom_line(aes(y = pdom), linetype = "F1", size = 1) +
  geom_line(aes(y = punc), linetype = "solid", size = 1) +
  labs(title = "Coding Success", x = "Year", y = "Percentage") +
  theme_bw()

codingsuccess # View plot

Data

ystats <- structure(list(pnattacks = c(96.6954022988506, 94.229722373435, 
95.4063604240283, 93.9429464634623, 94.5975744211687, 96.4044943820225, 
96.3838166845686, 93.6634494334872, 92.4137931034483, 95.6087824351297, 
89.628349178911, 93.6086529006883, 93.4337997847148, 95.7178841309824, 
93.8461538461539, 96.2779156327543, 95.0248756218905, 96.039603960396, 
96.7592592592593, 96.1538461538462, 96.219035202086, 95.4599761051374, 
86.3636363636364, 94.058229352347, 95.1696377228292, 94.8897256589564, 
93.6298076923077, 91.9762258543834, 87.6906318082789, 89.1412056151941
), ptra = c(91.6216216216216, 94.2408376963351, 94.6564885496183, 
90.9090909090909, 87.7952755905512, 94.5887445887446, 97.5450081833061, 
96.7051070840198, NaN, 97.6311336717428, 93.2668329177057, 95.9090909090909, 
93.3920704845815, 99.4413407821229, 97.0588235294118, 98.7421383647799, 
95.4022988505747, 98.4455958549223, 95.852534562212, 95.4022988505747, 
94.9074074074074, 94.6341463414634, 93.1578947368421, 94.5205479452055, 
94.0639269406393, 88.5826771653543, 91.554054054054, 89.3041237113402, 
85.7374392220421, 91.2866449511401), pdom = c(98.116539140671, 
97.3818181818182, 98.1110475100172, 97.2609561752988, 97.7892756349953, 
98.0813953488372, 97.5047080979284, 95.0148367952522, NaN, 96.2833914053426, 
94.4610778443114, 94.6575342465753, 94.4532488114105, 99.5024875621891, 
92.375366568915, 96.4285714285714, 98.0952380952381, 96.9072164948454, 
97.2027972027972, 98.1060606060606, 97.9779411764706, 98.3660130718954, 
96.9072164948454, 95.5916473317865, 98.2203969883641, 97.3514211886305, 
95.5904334828102, 94.8237394020527, 88.2456915598763, 88.370142577579
), punc = c(15.8536585365854, 23.1884057971014, 41.5730337078652, 
0.641025641025641, 25, 20.7920792079208, 19.8412698412698, 22.5641025641026, 
0, 31.9587628865979, 21.0526315789474, 43.4782608695652, 49.5867768595041, 
10.5263157894737, 7.69230769230769, 9.09090909090909, 0, 40, 
8.69565217391304, 0, 0, 7.31707317073171, 0.75187969924812, 1.96078431372549, 
14.2857142857143, 25.1968503937008, 7.01754385964912, 50.6398537477148, 
60.5584642233857, 73.558981233244), iyear = 1985:2014), .Names = c("pnattacks", 
"ptra", "pdom", "punc", "iyear"), row.names = c(NA, 30L), class = "data.frame")
  • Possible duplicate of [Missing legend with ggplot2 and geom\_line](http://stackoverflow.com/questions/5027016/missing-legend-with-ggplot2-and-geom-line) – lbusett Mar 19 '17 at 14:04
  • You need to reorganize your data from wide format to long format, perhaps using the `reshape2` or `tidyr` package. Provide an example dataset and then maybe we can show you how to do it. – www Mar 19 '17 at 14:09
  • Please have a look at [How to make a great R reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) to see how you can share your data. – lbusett Mar 19 '17 at 15:56

1 Answer


As suggested in the comments, tidyr::gather will get your data from wide to long format, which makes ggplot much easier.
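
To make the reshaping concrete, here is a minimal sketch on a made-up three-row data frame (`toy` and its values are hypothetical, not your data). `gather()` stacks every column except `iyear` into one column of variable names and one column of values, giving one row per year and variable:

```r
library(tidyr)

# Hypothetical toy data in the same wide layout as ystats
toy <- data.frame(
  iyear = 1985:1987,
  ptra  = c(91, 94, 95),
  pdom  = c(98, 97, 98)
)

# Stack all columns except iyear:
# column names go into coding_success, their values into percentage
toy_long <- gather(toy, coding_success, percentage, -iyear)

toy_long  # 6 rows: 3 years x 2 variables
```

With tidyr 1.0.0 or later, `pivot_longer(toy, -iyear, names_to = "coding_success", values_to = "percentage")` should produce the same long layout, though the rows come out in a different order.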

I'd suggest color rather than linetype to distinguish the groups, but here it is with linetype:

library(tidyr)
library(ggplot2)

ystats %>% 
  gather(coding_success, percentage, -iyear) %>% 
  ggplot(aes(iyear, percentage)) + 
    geom_line(aes(linetype = coding_success), size = 1) +
    scale_linetype_manual(values = c("F1", "dotted", "longdash", "solid")) +
    labs(x = "Year", y = "Percentage", title = "Coding Success")

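
If you would rather keep the wide data and one `geom_line()` per variable, a legend should also appear once `linetype` is mapped inside `aes()` to a constant label, with the actual line types supplied through a named vector. This is only a sketch under assumptions: `toy` is a hypothetical stand-in for `ystats`, and the legend labels simply reuse the column names, so substitute whatever labels you need.

```r
library(ggplot2)

# Hypothetical toy data standing in for ystats
toy <- data.frame(
  iyear = 1985:1989,
  ptra  = c(91, 94, 95, 91, 88),
  pdom  = c(98, 97, 98, 97, 98)
)

# Named vector: legend label -> line type (labels are placeholders)
line_types <- c(ptra = "longdash", pdom = "F1")

p <- ggplot(toy, aes(x = iyear)) +
  # A constant string inside aes() creates a legend entry for that layer
  geom_line(aes(y = ptra, linetype = "ptra"), size = 1) +
  geom_line(aes(y = pdom, linetype = "pdom"), size = 1) +
  # The names in line_types are matched against the strings used in aes()
  scale_linetype_manual(name = NULL, values = line_types) +
  labs(title = "Coding Success", x = "Year", y = "Percentage") +
  theme_bw()

p
```

Because the vector is named, the same `line_types` object can be reused across several plots to keep each series on the same line type. Recent ggplot2 releases (3.4.0 and later) prefer `linewidth` over `size` for lines, so expect a deprecation warning there.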

neilfws