I need to add multiple regression lines to a scatterplot that I made with ggplot2. The plot shows two series, male and female, but I'm having a bit of trouble setting up the code. I created the scatterplot using the following code:

ggplot(LifeSatisfaction, aes(x=Country)) + 
  geom_point(aes(y = Life_Satisfaction_Female), color = "palevioletred2") + 
  geom_smooth(method = "lm", y = Life_Satisfaction_Female, col = "red") +
  geom_point(aes(y = Life_Satisfaction_Male), color="steelblue") +
  geom_smooth(method = "lm", y = Life_Satisfaction_Male, col = "blue") +
  labs (title = "Life Satisfaction per Country", x = "Country", y = "Life Satisfaction Rating") + ylim(5, 8)

I tried using geom_smooth(), but it doesn't seem to work with multiple values. Any and all help is appreciated!

Edit: just wanted to say that I'm very new to RStudio and coding in general, so please explain in simple terms haha

Ronak Shah
winn

1 Answer

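In geom_point(), the y aesthetic is mapped inside aes() to a Life_Satisfaction column, so ggplot2 looks up its values in that column of the data frame. In your geom_smooth() calls, y sits outside aes(), so it is not treated as a column mapping. If you do the same thing for geom_smooth() (put y = Life_Satisfaction_Female or y = Life_Satisfaction_Male inside aes()), it should get the data from that column correctly: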

ggplot(LifeSatisfaction, aes(x=Country)) + 
  geom_point(aes(y = Life_Satisfaction_Female), color = "palevioletred2") + 
  geom_point(aes(y = Life_Satisfaction_Male), color = "steelblue") +
  geom_smooth(aes(y = Life_Satisfaction_Female), method = "lm", col = "red") +
  geom_smooth(aes(y = Life_Satisfaction_Male), method = "lm", col = "blue") +
  labs(title = "Life Satisfaction per Country", x = "Country", y = "Life Satisfaction Rating") +
  ylim(5, 8)
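
One possible reason the plot can still come out without a regression line: Country is a discrete variable, so ggplot2 puts each country in its own group by default, and a group with a single point gives geom_smooth() nothing to fit. Below is a minimal sketch, assuming the eight rows the asker posted in the comments stand in for the full data set (the data frame name `life` is made up for this example). It adds group = 1 so that each geom_smooth() sees all rows at once, and it sets formula = y ~ x explicitly, which also silences the "using formula" message (that message is informational, not an error).

```r
library(ggplot2)

# Sample rows copied from the asker's comment; the full data set is not shown.
# The name `life` is hypothetical.
life <- data.frame(
  Country = c("Australia", "Austria", "Belgium", "Canada",
              "Chile", "Czech Republic", "Denmark", "Estonia"),
  Life_Satisfaction_Female = c(7.4, 7.1, 6.9, 7.4, 6.5, 6.7, 7.6, 5.7),
  Life_Satisfaction_Male   = c(7.2, 7.1, 6.9, 7.3, 6.5, 6.7, 7.5, 5.8)
)

# group = 1 overrides the default one-group-per-country behaviour,
# so each smooth layer is fitted over all countries together.
p <- ggplot(life, aes(x = Country, group = 1)) +
  geom_point(aes(y = Life_Satisfaction_Female), color = "palevioletred2") +
  geom_point(aes(y = Life_Satisfaction_Male), color = "steelblue") +
  geom_smooth(aes(y = Life_Satisfaction_Female), method = "lm",
              formula = y ~ x, se = FALSE, color = "red") +
  geom_smooth(aes(y = Life_Satisfaction_Male), method = "lm",
              formula = y ~ x, se = FALSE, color = "blue") +
  labs(title = "Life Satisfaction per Country",
       x = "Country", y = "Life Satisfaction Rating") +
  ylim(5, 8)

print(p)
```

Keep in mind that the line is fitted against each country's position on the axis (alphabetical order by default), so its slope has no real statistical meaning.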
user102162
  • Hi, thanks so much for answering. I tried running your code, but it's giving me the error message "`geom_smooth()` using formula 'y ~ x'." Any idea how to fix this? – winn Dec 08 '20 at 02:17
  • Does it display the plot? Is it an error message or just an info message? – user102162 Dec 08 '20 at 02:21
  • It displays the plot. I'm not sure if it's info or an error, it just displays "geom_smooth() using formula 'y ~ x' " in red letters. The plot shows up fine, but there's no regression line. – winn Dec 08 '20 at 02:24
  • Can you post your data, or a sample of your data that's enough to reproduce what you're seeing, or fake data that reproduces the problem? See the first answer here: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – user102162 Dec 08 '20 at 02:26
  • Here are the first few rows of the full data set. I imported a .csv file into RStudio and attached it. Edit: Whoops, didn't realize how it would format in the comments. The first value after a country is female satisfaction, and the second value is male satisfaction. Columns: Country, Life_Satisfaction_Female, Life_Satisfaction_Male. Rows: Australia 7.4 7.2; Austria 7.1 7.1; Belgium 6.9 6.9; Canada 7.4 7.3; Chile 6.5 6.5; Czech Republic 6.7 6.7; Denmark 7.6 7.5; Estonia 5.7 5.8 – winn Dec 08 '20 at 02:28
  • Is your `Country` variable a list of country names? What do you want the graph to show? – user102162 Dec 08 '20 at 02:32
  • Yes, the country variable is a list of country names, and the y values are the satisfaction ratings for each gender, male and female, so I have a multi-layered scatter plot. I am trying to create two regression lines, one showing the correlation between male values and one showing the correlation between female values. Sorry if I'm giving a bad explanation, but idk how to explain it. – winn Dec 08 '20 at 02:37
  • The correlation between male values and what other variable? A regression would tell you something like: as variable X increases by one point, male life satisfaction increases by 0.4 points. What is variable X? Maybe - do you just want to connect the points with a line? If so, use `geom_path()` or `geom_line()` instead of `geom_smooth()` – user102162 Dec 08 '20 at 02:43
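
The two readings in the last comment can be sketched as follows. This is only an illustration built on the eight rows posted in the comments. The object names (`life`, `lines_plot`, `fit`) are made up, and using female satisfaction as "variable X" is purely an assumption, since the thread never says which numeric predictor is intended.

```r
library(ggplot2)

# Sample rows copied from the asker's comment; the name `life` is hypothetical.
life <- data.frame(
  Country = c("Australia", "Austria", "Belgium", "Canada",
              "Chile", "Czech Republic", "Denmark", "Estonia"),
  Life_Satisfaction_Female = c(7.4, 7.1, 6.9, 7.4, 6.5, 6.7, 7.6, 5.7),
  Life_Satisfaction_Male   = c(7.2, 7.1, 6.9, 7.3, 6.5, 6.7, 7.5, 5.8)
)

# Reading 1: just connect the points with a line.
# With a discrete x, geom_line() needs group = 1 to join points across countries.
lines_plot <- ggplot(life, aes(x = Country, group = 1)) +
  geom_line(aes(y = Life_Satisfaction_Female), color = "red") +
  geom_line(aes(y = Life_Satisfaction_Male), color = "blue") +
  labs(title = "Life Satisfaction per Country",
       x = "Country", y = "Life Satisfaction Rating")

# Reading 2: a regression needs a numeric X.
# ASSUMPTION: female satisfaction is used as X here only for illustration.
fit <- lm(Life_Satisfaction_Male ~ Life_Satisfaction_Female, data = life)

# The second coefficient is the change in male satisfaction
# per one-point increase in female satisfaction.
print(coef(fit))
```

Calling print(lines_plot) draws the connected-points version. The slope from coef(fit) is the kind of "as X increases by one point" statement the comment describes.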