
I have just started using R in these past few weeks and this is my very first post. I have tried to implement a previous solution posted on Stack Overflow without success. The Stack Overflow question the code comes from is: Add regression line equation and R^2 on graph

I have a dataset with an independent variable which is a timestamp in POSIXct format, starting at '2011-05-27 16:00:02' and ending at '2020-02-18 19:00:34'. When the linear model function lm() is called within geom_smooth(), I can see the correct trend line on the plot. However, I don't seem to be able to add the line equation and the R^2 value to the plot. I think it may have something to do with the x-value in the geom_text function, because the x-axis is in POSIXct format.

An additional problem arises when calling the linear model function lm() on its own: it doesn't give the y-intercept I expect. I think this is because the POSIXct format makes the fit regress back to 1970. Any help would be greatly appreciated.

library(dplyr)    # filter(), %>%
library(ggplot2)
library(scales)   # date_breaks(), date_format()

## Extract hourly data from 5-minute data
data <- filter(rawdata, grepl(":00:", timestamp)) %>% droplevels

## Convert timestamp factor to POSIXct
newtimestamp <- as.POSIXct(data$timestamp, format = '%Y-%m-%d %H:%M:%S')

## Stack Overflow code for adding the regression line equation and R^2 value to the plot
lm_eqn <- function(data){
  m <- lm(demand ~ newtimestamp, data = data);
  eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, 
                   list(a = format(unname(coef(m)[1]), digits = 2),
                        b = format(unname(coef(m)[2]), digits = 2),
                        r2 = format(summary(m)$r.squared, digits = 3)))
  as.character(as.expression(eq));
}
#
#
## Visualization - plotting total demand against time using ggplot2

v1 <- ggplot(data=data, aes(x=newtimestamp, y=demand)) +
  geom_point(col="steelblue", size=0.0005, show.legend = T) +
  geom_smooth(method="lm", col="red") +
  scale_x_datetime(breaks = date_breaks('1 year'),labels=date_format('%Y')) +
  ggtitle("Total Demand over Time") + 
  ylab("Demand") +
  xlab("Time") +
  ylim(c(10000, 60000))   # deletes outliers on the y-axis
v1
v2 <- v1 + geom_text(x = '2011-05-27 16:00:02', y = 50000,label = lm_eqn(data), parse = TRUE)
v2
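
For the intercept issue described above, here is a minimal sketch on made-up data (the `toy` data frame, its values, and the `years` column are assumptions for illustration, not the real demand series). A POSIXct value is stored as seconds since 1970-01-01 UTC, so the intercept of a fit against the raw timestamp is the fitted value at that epoch. Measuring time from the first observation instead should give an intercept that refers to the start of the data, without changing the fitted line.

```r
# Synthetic daily series (hypothetical values, not the real dataset)
set.seed(1)
ts <- seq(as.POSIXct("2011-05-27 16:00:00", tz = "UTC"),
          as.POSIXct("2020-02-18 19:00:00", tz = "UTC"),
          by = "1 day")
secs_since_start <- as.numeric(ts) - as.numeric(ts[1])
toy <- data.frame(
  ts     = ts,
  demand = 30000 + 0.00002 * secs_since_start + rnorm(length(ts), sd = 500)
)

# Fit against the raw timestamp, made explicit here with as.numeric():
# x is seconds since 1970-01-01, so the intercept is the fitted demand in 1970.
m_raw <- lm(demand ~ as.numeric(ts), data = toy)

# Fit against years elapsed since the first observation:
# the intercept is now the fitted demand at the start of the series.
toy$years <- (as.numeric(toy$ts) - as.numeric(min(toy$ts))) / (365.25 * 24 * 3600)
m_ctr <- lm(demand ~ years, data = toy)

coef(m_raw)
coef(m_ctr)
```

Both models describe the same line; only the origin and units of the time axis differ, so the slope of `m_ctr` is in demand units per year rather than per second.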
  • 2
    Try `geom_text(x = as.POSIXct('2011-05-27 16:00:02'), y = ...` etc – Allan Cameron Feb 26 '20 at 12:41
  • 1
    It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Feb 26 '20 at 16:35
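
The first comment's suggestion can be sketched on made-up data as follows (the `toy` data frame, the label position, and the simplified R^2-only label are assumptions for illustration). Because the x-axis is a datetime scale, the x position of the label is given as a POSIXct value rather than a character string. This sketch uses `annotate()` instead of `geom_text()` so that the label is drawn once rather than once per row of the data.

```r
library(ggplot2)

# Synthetic weekly series (hypothetical values, not the real dataset)
set.seed(1)
toy <- data.frame(
  ts = seq(as.POSIXct("2011-05-27 16:00:00", tz = "UTC"),
           as.POSIXct("2020-02-18 19:00:00", tz = "UTC"),
           by = "1 week")
)
toy$demand <- 30000 + 10 * seq_along(toy$ts) + rnorm(nrow(toy), sd = 2000)

# Fit the model and build a plotmath label string for parse = TRUE
m <- lm(demand ~ as.numeric(ts), data = toy)
label <- sprintf("italic(r)^2 == %.3f", summary(m)$r.squared)

p <- ggplot(toy, aes(x = ts, y = demand)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x) +
  # x must be a POSIXct value to match the datetime x-axis
  annotate("text",
           x = as.POSIXct("2012-06-01 00:00:00", tz = "UTC"),
           y = max(toy$demand),
           label = label, parse = TRUE, hjust = 0)
p
```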

0 Answers