I would like to create a line graph that shows the trend of four air pollutants over the years 2009 to 2019.

Year CO2 NO2 O3 PM2.5
2009 30 18 20 30
2010 32 16 22 20
2011 33 16 24 20
2012 32 15 25 22
2013 34 14 27 24
2014 36 14 28 22
2015 38 13 29 20
2016 39 13 30 18
2017 40 12 32 16
2018 44 13 34 15
2019 45 11 38 14

I tried the code below, but it produces a bar chart rather than a line graph. I would like a line graph where all four pollutants are in the same plot.

df %>%
  ggplot(aes(x = Year, y = n, fill = airpollutants)) +
  geom_col() +
  facet_wrap(~Year)  + ggtitle("trend of airpollutants")

I want this output: https://cdn.ablebits.com/_img-blog/line-graph/line-graph-excel.png

socialscientist

3 Answers

You could reshape your data from wide to long and colour each air pollutant like this:

df <- read.table(text = "Year   CO2 NO2 O3  PM2.5
2009    30  18  20  30
2010    32  16  22  20
2011    33  16  24  20
2012    32  15  25  22
2013    34  14  27  24
2014    36  14  28  22
2015    38  13  29  20
2016    39  13  30  18
2017    40  12  32  16
2018    44  13  34  15
2019    45  11  38  14
", header = TRUE)

library(ggplot2)
library(dplyr)
library(reshape)
df %>%
  melt(id = "Year") %>%
  mutate(variable = as.factor(variable)) %>%
  ggplot(aes(x = Year, y = value, colour = variable)) +
  geom_line() +
  labs(colour = "airpollutants") +
  ggtitle("trend of airpollutants")

Created on 2022-07-26 by the reprex package (v2.0.1)
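
To make the wide-to-long step concrete, here is a minimal sketch of what the reshaped data looks like. It uses `tidyr::pivot_longer()` as an assumed stand-in for `reshape::melt()` (not the call used above), and keeps the column names `variable` and `value` only to match the plot code in this answer.

```r
library(tidyr)

# Data from the question, in wide format (one column per pollutant)
df <- data.frame(
  Year  = 2009:2019,
  CO2   = c(30, 32, 33, 32, 34, 36, 38, 39, 40, 44, 45),
  NO2   = c(18, 16, 16, 15, 14, 14, 13, 13, 12, 13, 11),
  O3    = c(20, 22, 24, 25, 27, 28, 29, 30, 32, 34, 38),
  PM2.5 = c(30, 20, 20, 22, 24, 22, 20, 18, 16, 15, 14)
)

# Long format: one row per Year/pollutant pair
df_long <- pivot_longer(df, cols = -Year,
                        names_to = "variable", values_to = "value")

# Inspect the first few rows of the reshaped data
head(df_long, 4)
```

Each year now takes up four rows, one per pollutant, which is the shape `aes(colour = variable)` needs in order to draw one line per pollutant.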

Quinten

Usually you'll want to be in long format when plotting in ggplot2.

One way to draw multiple lines without reshaping to long format is to map over the columns:

# One geom_line() layer per pollutant column (the \(x, y) lambda needs R >= 4.1)
ggplot(data = df) +
  purrr::map2(df[-1], names(df[-1]), \(x, y) geom_line(aes(x = df$Year, y = x, col = y))) +
  labs(x = "Year",
       y = "Concentration",
       col = "Pollutant")
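
A variant of the same idea, offered as a sketch rather than as part of the answer above: base `lapply()` over the column names, with ggplot2's `.data` pronoun looking up each column so that `df$Year` does not have to appear inside `aes()`. The data frame is the one from the question, and the names `layers`, `nm` and `p` are illustrative.

```r
library(ggplot2)

# Data from the question, in wide format
df <- data.frame(
  Year  = 2009:2019,
  CO2   = c(30, 32, 33, 32, 34, 36, 38, 39, 40, 44, 45),
  NO2   = c(18, 16, 16, 15, 14, 14, 13, 13, 12, 13, 11),
  O3    = c(20, 22, 24, 25, 27, 28, 29, 30, 32, 34, 38),
  PM2.5 = c(30, 20, 20, 22, 24, 22, 20, 18, 16, 15, 14)
)

# One geom_line() layer per pollutant column;
# .data[[nm]] looks the column up by name, and colour = nm labels the line
layers <- lapply(names(df)[-1], function(nm) {
  geom_line(aes(x = Year, y = .data[[nm]], colour = nm))
})

# Adding a list of layers to a plot adds each layer in turn
p <- ggplot(df) +
  layers +
  labs(x = "Year", y = "Concentration", colour = "Pollutant")
p
```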


2
set.seed(123)
library(ggplot2)
library(tidyr)

# Example data
df <- data.frame(year = 2009:2019,
                 CO2 = sample(30:40, 11),
                 NO2 = sample(10:20, 11),
                 O3 = sample(20:30, 11),
                 PM2.5 = sample(15:25, 11))

# Convert to long format
df_long <- pivot_longer(df, 
                        cols = c(CO2, NO2, O3, PM2.5), 
                        values_to = "Concentration",
                        names_to = "Pollutant")


# Plot
ggplot(df_long,
       aes(
         x = year,
         y = Concentration,
         color = Pollutant,
         linetype = Pollutant
       )) +
  geom_line(linewidth = 0.7) +  # use `size = 0.7` on ggplot2 versions before 3.4.0
  ggtitle("Trend of Air Pollutants") +
  xlab("Year") +
  ylab("Concentration") +
  scale_x_continuous(breaks = seq(2009, 2019, by = 1), limits = c(2009,2019)) +
  theme_minimal()

socialscientist
  • How can I change it so that the lines are solid rather than dotted? –  Jul 26 '22 at 12:30
  • Remove the `linetype = Pollutant` line! It makes your groups (pollutants) control the line types as well as the line colors. Note that varying the line type is often recommended, because people who look at your figure may be colorblind or may print it in black and white; changing the line type helps a little with making your work more accessible. – socialscientist Jul 26 '22 at 12:31
  • Perfect, many thanks. I am very new to R. –  Jul 26 '22 at 12:32
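
Following the comment about `linetype`, here is a minimal sketch of the same plot with solid lines only, which is the answer's plot with the `linetype = Pollutant` mapping dropped. It assumes the data from the question rather than the sampled example data, and the object name `p` is illustrative.

```r
library(ggplot2)
library(tidyr)

# Data from the question, in wide format
df <- data.frame(
  year  = 2009:2019,
  CO2   = c(30, 32, 33, 32, 34, 36, 38, 39, 40, 44, 45),
  NO2   = c(18, 16, 16, 15, 14, 14, 13, 13, 12, 13, 11),
  O3    = c(20, 22, 24, 25, 27, 28, 29, 30, 32, 34, 38),
  PM2.5 = c(30, 20, 20, 22, 24, 22, 20, 18, 16, 15, 14)
)

# Convert to long format: one row per year/pollutant pair
df_long <- pivot_longer(df, cols = -year,
                        names_to = "Pollutant",
                        values_to = "Concentration")

# Only colour is mapped to Pollutant, so every line is drawn solid
p <- ggplot(df_long, aes(x = year, y = Concentration, color = Pollutant)) +
  geom_line() +
  scale_x_continuous(breaks = 2009:2019) +
  labs(title = "Trend of Air Pollutants", x = "Year", y = "Concentration") +
  theme_minimal()
p
```

Adding `linetype = Pollutant` back into `aes()` restores the distinct line types if the figure needs to stay readable in black and white.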