
I am very new to R and am working on a volunteer project to predict some parameters from similar observations across several categorical types: for example, the same parameters measured for different people, with different values for each person.

I got this output from ggplot:

[Image: the ggplot output]

My questions are:

  1. How do I get this to plot a straight line?

  2. From this plot, does it look like I am doing something wrong and, if so, is it related to the `lm` function?

COG
  • Hi COG. What do you mean by "straight line"? What was your code? Can you share a sample of the data? – iod Jun 29 '18 at 16:42
  • Are you talking about a regression line, e.g. Excel's trendline? If so, see `geom_smooth(., method = "lm")`. Also, could you supply us with your data (https://stackoverflow.com/help/mcve)? – jordan Jun 29 '18 at 16:55
  • I am talking about a regression line, pretty much. Here is part of my code; I added geom_smooth but did not see the line. `regressor <- lm(formula = Cum_cost ~ Depth + ......, data = dataset)` `ggplot() + geom_point(aes(x = dataset$Depth, y = dataset$Cum_cost), colour = 'red') + geom_line(aes(x = dataset$Depth, y = predict(regressor, newdata = dataset)), colour = 'blue')` – COG Jun 30 '18 at 04:27
  • I modified my code and got the geom_smooth to show. Thanks! @jordan I am looking for how to accept your answer but have not found it. `ggplot(dataset, aes(x = dataset$Depth, y = dataset$Cum_Cost)) + geom_point(colour = 'red') + geom_smooth(method = lm) + geom_line(aes(x = dataset$Depth, y = predict(regressor, newdata = dataset)), colour = 'blue')` – COG Jun 30 '18 at 04:42

1 Answer


Since the OP did not provide an MRE (minimal reproducible example), I'm using the `flights` dataset from the nycflights13 package.

library(ggplot2)
library(dplyr)
library(lubridate)
library(nycflights13) # https://github.com/hadley/nycflights13

dataset <- 
  flights %>% 
  # create departure date
  mutate(departure = make_date(year, month, day)) %>% 
  # calculate average departure delay
  group_by(departure) %>% 
  summarize(dep_delay_mean = mean(dep_delay, na.rm = TRUE)) %>% 
  # remove outlier
  filter(dep_delay_mean < 60)

head(dataset)

# A tibble: 6 x 2
  departure  dep_delay_mean
  <date>              <dbl>
1 2013-01-01          11.5 
2 2013-01-02          13.9 
3 2013-01-03          11.0 
4 2013-01-04           8.95
5 2013-01-05           5.73
6 2013-01-06           7.15

ggplot(data = dataset, aes(x = departure, y = dep_delay_mean)) +
  geom_point(colour = "red") +
  geom_line(colour = "blue") +
  geom_smooth(method = "lm", colour = "orange", se = FALSE) +
  theme_minimal()

[Image: the resulting plot]
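As for why the blue `geom_line()` in the OP's comment is not straight: I can't check this without the data, so treat it as a guess. If the model has several predictors, as the `Depth + ......` formula in the comment suggests, then `predict()` depends on all of them, and plotting the fitted values against `Depth` alone gives a jagged line. `geom_smooth(method = "lm")` fits `y ~ x` only, which is why it comes out straight. Here is a minimal sketch using the built-in `mtcars` data as a stand-in; the variables `mpg`, `wt`, and `hp` are my choice for illustration, not the OP's columns.

```r
library(ggplot2)

# Stand-in data: built-in mtcars (an assumption; the OP's dataset was not shared)
fit_multi  <- lm(mpg ~ wt + hp, data = mtcars)  # several predictors
fit_single <- lm(mpg ~ wt, data = mtcars)       # same model geom_smooth(method = "lm") fits

plot_data <- mtcars
plot_data$pred_multi  <- predict(fit_multi,  newdata = mtcars)
plot_data$pred_single <- predict(fit_single, newdata = mtcars)

p <- ggplot(plot_data, aes(x = wt)) +
  geom_point(aes(y = mpg), colour = "red") +
  # jagged: fitted values also change with hp
  geom_line(aes(y = pred_multi), colour = "blue") +
  # straight: fitted values depend on wt only
  geom_line(aes(y = pred_single), colour = "orange") +
  theme_minimal()

p
```

If you want a straight line from a multi-predictor model, one option is to predict over a grid of the x variable while holding the other predictors fixed (for example, at their means), but whether that is meaningful depends on the data.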

jordan