I'm trying to plot the median monthly household income of the US population against my own monthly income for the same years. I also want to show how my monthly income changed with degree level. So far, this is the code I have:

library(ggplot2)
library(ggpubr)  # provides theme_cleveland()

ggplot(data = Earnings_Year)+
  geom_smooth(mapping = aes(x=Year,y=Month_USD))+
  geom_point(mapping = aes(x=Year,y=Month_USD,color=Degree))+
  geom_smooth(mapping = aes(x=Year,y=Median_Household_Income_US))+
  geom_point(mapping = aes(x=Year,y=Median_Household_Income_US))+
  labs(title = "Earning Comparison to Population",
       subtitle = "Individual vs Population Median 2006-2021",
       caption = "*Statistica Research Department, Jan. 20, 2021.")+
  theme_cleveland()+
  theme(plot.title = element_text(color = "blue",
                                  size = 16,
                                  face = "bold"),
        plot.subtitle = element_text(size=10,
                                     face = "bold"),
        plot.caption = element_text(face = "italic"))

Here is a picture of what it produces:

example_image.png

My issue is that I have a legend for the degrees I obtained, but no legend entry for the top line (in other words, nothing in the legend specifies that the top line is median household income). Any fix you can suggest would be appreciated. Thanks!

Here is what dput spits out:

Earnings_Year <- structure(list(Year = 2006:2021, Month_USD = c(1160L, 1240L, 
1360L, 1480L, 1320L, 1320L, 375L, 1600L, 2000L, 2000L, 1600L, 
2240L, 1900L, 2300L, 2900L, 2300L), Degree = c("High School", 
"High School", "High School", "High School", "High School", "High School", 
"High School", "High School", "High School", "BA", "BA", "BA", 
"BA", "BA", "M.Ed", "M.Ed"), Country = c("USA", "USA", "USA", 
"USA", "USA", "USA", "DE", "USA", "USA", "USA", "USA", "USA", 
"PRC", "PRC", "PRC", "HK"), Job = c("Disher", "Prep", "Prep", 
"Prep", "Prep", "Prep", "Au Pair", "CSA", "Valet", "Valet", "Intake", 
"CM", "Teacher", "Teacher", "Teacher", "Student"), Median_Household_Income_US = c(4833L, 
4961L, 4784L, 4750L, 4626L, 4556L, 4547L, 4706L, 4634L, 4873L, 
5025L, 5218L, 5360L, 5725L, NA, NA), US_Home_Price_Index = c(183.24, 
173.36, 152.56, 146.69, 140.64, 135.16, 143.88, 159.3, 166.5, 
175.17, 184.51, 195.99, 204.9, 212.59, 236.31, NA)), class = "data.frame", row.names = c(NA, 
-16L))
Shawn Hemelstrand
  • To provide a [reproducible](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) example, you need to copy/paste the output from `dput(Earnings_Year)`. If it is too large, you can use `head()` or `subset()` to share only a sample of your dataset. Alternatively, you can use base R datasets (to see the complete list, run `library(help = "datasets")`). Legend problems with ggplot are very often related to how you use `aes()` and how tidy your data is. – Paul Aug 10 '21 at 12:57
  • Ah sorry about that. I edited it again with what came out of the dput command. Thanks! – Shawn Hemelstrand Aug 10 '21 at 23:54
  • I think I came up with an answer. Thank you for your knowledge! – Shawn Hemelstrand Aug 11 '21 at 03:24

2 Answers

Your question would be a lot clearer if you included a sample of your data (e.g. using dput(head(Earnings_Year))) but perhaps this reproducible example will help you solve your problem:

library(tidyverse)

dat1 <- tibble(year = 2001:2021,
               pop_mean = sample(4000:6000, size = 21),
               ind_earnings = sample(1500:3000, size = 21),
               degree = c(rep("None", 7), rep("BA", 7), rep("M.Ed", 7)))

ggplot(dat1, aes(x = year)) +
  geom_smooth(aes(y=ind_earnings))+
  geom_point(aes(y=ind_earnings, color=degree))+
  geom_smooth(aes(y=pop_mean))+
  geom_point(aes(y=pop_mean, fill = "Median\nhousehold\nincome"))+
  labs(title = "Earning Comparison to Population",
       subtitle = "Individual vs Population Median 2006-2021",
       caption = "*Statistica Research Department, Jan. 20, 2021.")+
  theme_minimal()+
  theme(plot.title = element_text(color = "blue",
                                  size = 16,
                                  face = "bold"),
        plot.subtitle = element_text(size=10,
                                     face = "bold"),
        plot.caption = element_text(face = "italic")) +
  scale_fill_discrete(name = "")
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'

Created on 2021-08-10 by the reprex package (v2.0.0)
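
The legend entry appears because a constant string inside `aes()` is treated as a one-level variable, and every mapped variable gets a scale and a legend key. Here is a minimal sketch of the same idea applied to the line itself rather than the points. The data frame `df` and its columns are made up for illustration and are not the asker's data:

```r
library(ggplot2)

# Hypothetical toy data, used only to illustrate the technique
df <- data.frame(year = 2006:2015,
                 personal = seq(1200, 2100, by = 100),
                 us_median = seq(4500, 5400, by = 100))

p <- ggplot(df, aes(x = year)) +
  geom_line(aes(y = personal)) +
  # A constant inside aes() becomes a one-level scale, so it gets a legend key
  geom_line(aes(y = us_median, linetype = "US median household income")) +
  # Pick the line type for that single level and drop the legend title
  scale_linetype_manual(name = NULL,
                        values = c("US median household income" = "dashed"))

p
```

Using `linetype` here leaves `colour` free for the degree levels, so the two legends should not get merged.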

jared_mamrot

I think I was able to finally pull it off. Here is my code:

ggplot(Earnings_Year,aes(x=Year))+
  geom_smooth(aes(y=Month_USD))+
  geom_point(aes(y=Month_USD,color=Degree))+
  geom_smooth(aes(y=Median_Household_Income_US))+
  geom_point(aes(y=Median_Household_Income_US,fill="Median_Household_Income_US"))+
  labs(title = "Earning Comparison to Population",
       subtitle = "Individual vs Population Median 2006-2021",
       caption = "*Statistica Research Department, Jan. 20, 2021.",
       fill="U.S. Population Median Income",
       colour="Personal Income by Degree")+
  theme_cleveland()+
  theme(plot.title = element_text(color = "blue",
                                  size = 16,
                                  face = "bold"),
        plot.subtitle = element_text(size=10,
                                     face = "bold"),
        plot.caption = element_text(face = "italic"))

Here is what it looks like: Graph Picture
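
As the comment about tidy data suggests, another option is to reshape the two income columns into long format, so that a single `colour` mapping produces a legend entry for each series. This is only a sketch of that alternative, using the first four rows of the data above. The names `earnings`, `earnings_long`, `Series` and `USD` are my own, and this layout drops the colouring by degree, which would need a separate aesthetic such as `shape`:

```r
library(ggplot2)
library(tidyr)

# First four rows of the data from the question
earnings <- data.frame(
  Year = 2006:2009,
  Month_USD = c(1160, 1240, 1360, 1480),
  Median_Household_Income_US = c(4833, 4961, 4784, 4750)
)

# Long format: one row per year and income series
earnings_long <- pivot_longer(earnings,
                              cols = c(Month_USD, Median_Household_Income_US),
                              names_to = "Series",
                              values_to = "USD")

# Readable legend labels instead of raw column names
earnings_long$Series <- factor(earnings_long$Series,
                               levels = c("Month_USD", "Median_Household_Income_US"),
                               labels = c("Personal income",
                                          "U.S. median household income"))

p <- ggplot(earnings_long, aes(x = Year, y = USD, colour = Series)) +
  geom_line() +
  geom_point() +
  labs(colour = NULL)

p
```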

Shawn Hemelstrand