
My problem: I need to make many line graphs. Each line graph will contain 12 lines, one per location. The code will be reused to make 15-20 such plots with standardized formatting, so I am trying to write reusable code rather than bespoke code for each variable.

Here is my data:

tmp <- structure(list(Year = c(1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021, 1971, 1976, 1981, 1986, 1991, 1996, 2001, 
2006, 2011, 2016, 2021), variable = structure(c(1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 10L, 
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L, 11L, 11L, 11L, 
11L, 11L, 11L, 11L, 11L, 11L, 11L, 11L, 12L, 12L, 12L, 12L, 12L, 
12L, 12L, 12L, 12L, 12L, 12L), .Label = c("FORT ERIE", "GRIMSBY", 
"LINCOLN", "NIAGARA FALLS", "NIAGARA-ON-THE-LAKE", "PELHAM", 
"PORT COLBORNE", "ST. CATHARINES", "THOROLD", "WAINFLEET", "WELLAND", 
"WEST LINCOLN"), class = "factor"), value = c(23113L, 24030L, 
24096L, 23253L, 26006L, 27183L, 28143L, 29925L, 29960L, 30710L, 
32901L, 15770L, 15565L, 15797L, 16956L, 18520L, 19585L, 21297L, 
23937L, 25325L, 27314L, 28883L, 14247L, 14460L, 14196L, 14391L, 
17149L, 18801L, 20612L, 21722L, 22487L, 23787L, 25719L, 67163L, 
69420L, 70960L, 72107L, 75399L, 76917L, 78815L, 82184L, 82997L, 
88071L, 94415L, 12552L, 12485L, 12186L, 12494L, 12945L, 13238L, 
13839L, 14587L, 15400L, 17511L, 19090L, 9997L, 10070L, 11104L, 
12137L, 13328L, 14343L, 15272L, 16155L, 16598L, 17110L, 18192L, 
21420L, 20535L, 19225L, 18281L, 18766L, 18451L, 18450L, 18599L, 
18424L, 18306L, 20033L, 109722L, 123350L, 124018L, 123455L, 129300L, 
130926L, 129170L, 131989L, 131400L, 133113L, 136803L, 15065L, 
14945L, 15412L, 16131L, 17542L, 17883L, 18048L, 18224L, 17931L, 
18801L, 23816L, 5486L, 6065L, 6000L, 5955L, 6203L, 6253L, 6258L, 
6601L, 6356L, 6372L, 6887L, 44397L, 45050L, 45448L, 45054L, 47914L, 
48411L, 48402L, 50331L, 50631L, 52293L, 55750L, 8396L, 9460L, 
9846L, 9918L, 10864L, 11513L, 12268L, 13167L, 13837L, 14500L, 
15454L)), row.names = c(NA, -132L), class = c("data.table", "data.frame"))

So far, my code is this:

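unique_years <- sort(unique(tmp$Year))  # assumed definition (not shown in the post): one break per census year
ggplot(data = tmp, aes(x = Year, y = value, color = variable, range(0,max(tmp$value)))) +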
    geom_point() +
    geom_line() +
    labs(title = "Total population in Niagara's 12 CSDs, 1971-2021",
         subtitle = "Upward trend overall",
         x = "Census Year",
         y = "Total population",
         color = "") +
    scale_x_continuous(breaks = unique_years) +
    scale_y_continuous(breaks = scales::breaks_width(10000), labels = scales::comma) +
    theme_classic() +
    theme(legend.position = "bottom") +
    theme(plot.title.position = 'plot',
          plot.title = element_text(hjust = 0.5))

Using long format data (that apparently I cannot add here; let me know if I am missing something)...

...to produce this graph: my functional but ugly line graph

My question: Is there a way to alter the line width and line type using something like scale_color_manual(), but for lines?

Optimally, every time I draw the graph, each location (e.g., Grimsby) would always have the same kind and color of line AND I would only have to alter that code in one location instead of in each ggplot(). When I have added geom_line(aes(linetype = variable)) to the code above, it has produced ugly results, so I don't think that will work.

Jeff Boggs
  • 1
    Yes, there is `scale_linewidth_manual` - see the ggplot documentation. – Andrew Gustar Apr 21 '23 at 21:38
  • Thank you, @Andrew Gustar. I'll set your comment as an answer after I finish and upload the code incorporating 'scale_linewidth_manual()'. I found specific, short examples at https://www.youtube.com/watch?v=qlmCXRihxLU already. – Jeff Boggs Apr 21 '23 at 22:23
  • 3
    Uploading data: yes you can, just please not tons of data, we only really need small, representative, unambiguous data. See https://stackoverflow.com/q/5963269 , [mcve], and https://stackoverflow.com/tags/r/info for discussions on `dput`, `data.frame`, and `read.table`. – r2evans Apr 21 '23 at 23:31
  • 1
    Great to see you've found a solution. Could you also review the links in the comment by @r2evans, and post some sample data please? Doing so will be helpful to others with similar issues if they stumble upon your question (and answer) in future. Many thanks. – L Tyrone May 02 '23 at 00:50
  • Thank you for the prompt. I used dput() to copy my dataset and added it above. – Jeff Boggs May 02 '23 at 01:01

1 Answer

Andrew Gustar notes that yes, there is scale_linewidth_manual(), defined in the ggplot documentation.
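
As a minimal sketch of that suggestion, assuming ggplot2 3.4.0 or later (where the `linewidth` aesthetic and `scale_linewidth_manual()` are available), line width can be mapped to the location and set from a named vector. The toy data frame `demo` and the widths in `lwdVec` are invented for illustration and are not part of my actual plot:

```r
library(ggplot2)

# Toy data, invented for illustration only (not the census data above)
demo <- data.frame(
  Year     = rep(c(2011, 2016, 2021), times = 2),
  variable = rep(c("GRIMSBY", "PELHAM"), each = 3),
  value    = c(10, 12, 15, 7, 8, 9)
)

# Named vector: each location keeps the same width no matter
# in which order the locations appear in the data
lwdVec <- c("GRIMSBY" = 1.2, "PELHAM" = 0.4)

p_demo <- ggplot(demo, aes(x = Year, y = value, group = variable,
                           linewidth = variable)) +
  geom_line() +
  scale_linewidth_manual(name = "CSD", values = lwdVec)
```

Because the values are named, the width follows the location name rather than its position among the factor levels.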

Ultimately, I went with the following code to use directlabels::direct.label(), scale_color_manual() and scale_linetype_discrete() after creating two variables to hold color and line type information. These were:

colVec <- c('#31a354', '#08519c', '#08519c', 'Black', '#08519c', 'Black', '#31a354', '#08519c', 'Black', '#31a354', 'Black', 'Black')

ltyVec <- rep(c("solid","dotted", "dashed"),c(4,4,4))
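
A variant that may be worth considering, sketched here under assumptions rather than taken from my final code: naming both vectors by location and passing them to `scale_color_manual()` and `scale_linetype_manual()` ties each style to a location name instead of its position. The names `csd`, `colNamed`, `ltyNamed`, and `csd_scales`, and the small `demo` data frame, are hypothetical and exist only for this example.

```r
library(ggplot2)

# The 12 CSD names, in the factor-level order used by `tmp`
csd <- c("FORT ERIE", "GRIMSBY", "LINCOLN", "NIAGARA FALLS",
         "NIAGARA-ON-THE-LAKE", "PELHAM", "PORT COLBORNE",
         "ST. CATHARINES", "THOROLD", "WAINFLEET", "WELLAND",
         "WEST LINCOLN")

# Same values as colVec and ltyVec, but named by location
colNamed <- setNames(
  c("#31a354", "#08519c", "#08519c", "Black", "#08519c", "Black",
    "#31a354", "#08519c", "Black", "#31a354", "Black", "Black"),
  csd
)
ltyNamed <- setNames(rep(c("solid", "dotted", "dashed"), c(4, 4, 4)), csd)

# Hypothetical bundle: one list of scales that can be added to every plot
csd_scales <- list(
  scale_color_manual(name = "CSD", values = colNamed),
  scale_linetype_manual(name = "CSD", values = ltyNamed)
)

# Example use with a small made-up data frame
demo <- data.frame(
  Year     = rep(c(2016, 2021), times = 2),
  variable = rep(c("GRIMSBY", "WELLAND"), each = 2),
  value    = c(1, 2, 3, 4)
)
p_demo <- ggplot(demo, aes(x = Year, y = value,
                           color = variable, linetype = variable)) +
  geom_line() +
  csd_scales
```

With this arrangement the styling lives in one place, and a plot that shows only some of the locations still gives each one its usual color and line type.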

After defining those two variables, I ran this:

library(ggplot2)
library(directlabels)
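base_size <- 12
unique_years <- sort(unique(tmp$Year))  # assumed definition (not shown in the post): one break per census year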

p <- ggplot(data = tmp, aes(x = Year, y = value, xmax = 2046, color = variable, linetype = variable, range(0,max(tmp$value)))) +
        geom_line(linewidth = 0.4) +
        geom_point(shape = 20) +
        labs(title = "Total population in Niagara's 12 CSDs, 1971-2021",
             subtitle = "",
             caption = "Source: Statistics Canada, Census of Population, various years",
             x = "Census Year",
             y = "Total population",
             color = "") +
        scale_x_continuous(breaks = unique_years) +
        scale_y_continuous(breaks = scales::breaks_width(10000), labels = scales::comma) +
        scale_linetype_discrete(name = "CSD") +
        scale_color_manual(name = "CSD", values = colVec) +
        theme_classic(base_size = 12,
                      base_family = "Trebuchet MS", # might be unneeded
                      base_line_size = base_size/22,
                      base_rect_size = base_size/22) +
        theme(text=element_text(family="Trebuchet MS"), # needed
              legend.position = "bottom",
              axis.text.x = element_text(angle = 30, face = "bold"),
              axis.text.y = element_text(face = "bold"),
              plot.title.position = 'plot',
              plot.title = element_text(hjust = 0.5, face = "bold",
                                        size = 18)) +
        annotate("rect", xmin = 1971, xmax = 2021, ymin = 4000, ymax = 35000, alpha = .18) +
        annotate("text", x = 2006, y = 35000, label = "See Figure 2 for details", size = base_size/4)
direct.label(p, list(cex = 1,dl.trans(x=x+0.1), "last.bumpup"))

This code produced this line graph: [image]. It is an improvement over the line graph I posted originally.
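
Since the goal in the question was to reuse the formatting across 15-20 plots, one possible next step is to wrap the shared layers in a function. The following is only a sketch under assumptions: `csd_line_plot()` and its arguments are hypothetical names, the data are assumed to be in long format with columns `Year`, `variable`, and `value`, and the direct labels, fonts, and annotations from the code above are left out to keep it short.

```r
library(ggplot2)

# Hypothetical helper (not from the code above): draws one standardized
# line graph from long-format data with columns Year, variable, value.
# `extra` takes optional ggplot components, e.g. a list of manual scales.
csd_line_plot <- function(data, title, y_label, extra = NULL) {
  ggplot(data, aes(x = Year, y = value,
                   color = variable, linetype = variable)) +
    geom_line(linewidth = 0.4) +
    geom_point(shape = 20) +
    labs(title = title, x = "Census Year", y = y_label) +
    scale_x_continuous(breaks = sort(unique(data$Year))) +
    scale_y_continuous(labels = scales::comma) +
    extra +
    theme_classic() +
    theme(legend.position = "bottom",
          plot.title.position = "plot",
          plot.title = element_text(hjust = 0.5))
}

# Example call on a small made-up data frame
demo <- data.frame(
  Year     = rep(c(2016, 2021), times = 2),
  variable = rep(c("GRIMSBY", "WELLAND"), each = 2),
  value    = c(27314, 28883, 52293, 55750)
)
p_demo <- csd_line_plot(demo, "Total population", "Total population")
```

Each new variable would then need only one call with its own data, title, and axis label, while the formatting stays in a single place.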

Jeff Boggs
  • The colors correspond to location on the Niagara isthmus. Green CSDs border Lake Erie. Black CSDs border neither Great Lake. Blue CSDs border Lake Ontario. – Jeff Boggs May 01 '23 at 23:42