I have a CSV file with four columns (Year, TMAX, TMEAN, and TMIN) covering the years 1900 to 2014. In a single window, I want to draw three line graphs of TMAX, TMEAN, and TMIN with Year (1900:2014) on the x-axis. I also want to show the trend lines in the graphs, with their associated R-squared values in the legend. So far I have written the following code:

library(ggplot2)
library(reshape)
data=read.table("temp_red.csv",header=TRUE, sep=",")
frame=data.frame(data[1:4])
meltd=melt(frame,id.vars="Year")
matplot(frame[2:4], type = c("l"),col = 1:3)
ggplot(meltd, aes(x = Year, y = value, colour = variable)) + geom_line()

Year    TMAX    TMEAN   TMIN
1900    11.19989107 4.684640523 -1.837690632
1901    10.26497821 4.098583878 -2.074891068
1902    10.03077342 4.025054466 -1.99291939
1903    9.378540305 2.862472767 -3.651416122
1904    8.66040305  2.659313725 -3.351579521
1905    9.703703704 3.590686275 -2.534313725
1906    9.874455338 3.795479303 -2.290305011
2014    8.599673203 2.360566449 -3.88671024

I don't know how to display a trend line with its R-squared value on the graph in R. Please help.

Tal J. Levy
Lira
    This has been answered before; please see http://stackoverflow.com/questions/7549694/ggplot2-adding-regression-line-equation-and-r2-on-graph – MLavoie Jan 03 '16 at 11:09

2 Answers

I believe the following would work for you. Before I start, please note the related discussions here and here. First, I will generate some input:

library(dplyr)
library(ggplot2)
library(tidyr)
set.seed(1)
year <- 1990:2010
Tmax <- rnorm(21, 9)
Tmean <- rnorm(21, 3.5)
Tmin <- rnorm(21, -2)
df <- data.frame(year, Tmax, Tmean, Tmin)
df <- tbl_df(df)
df
Source: local data frame [21 x 4]

    year      Tmax    Tmean      Tmin
   (int)     (dbl)    (dbl)     (dbl)
1   1990  8.373546 4.282136 -1.303037
2   1991  9.183643 3.574565 -1.443337
3   1992  8.164371 1.510648 -2.688756
4   1993 10.595281 4.119826 -2.707495
5   1994  9.329508 3.443871 -1.635418
6   1995  8.179532 3.344204 -1.231467
7   1996  9.487429 2.029248 -2.112346
8   1997  9.738325 3.021850 -1.118892
9   1998  9.575781 3.917942 -1.601894
10  1999  8.694612 4.858680 -2.612026
..   ...    ...       ...

Next I will use tidyr to prepare the data for plotting:

df1 <- df %>% gather(key, Value, -year)
df1
Source: local data frame [63 x 3]

    year    key     Value
   (int) (fctr)     (dbl)
1   1990   Tmax  8.373546
2   1991   Tmax  9.183643
3   1992   Tmax  8.164371
4   1993   Tmax 10.595281
5   1994   Tmax  9.329508
6   1995   Tmax  8.179532
7   1996   Tmax  9.487429
8   1997   Tmax  9.738325
9   1998   Tmax  9.575781
10  1999   Tmax  8.694612
..   ...    ...       ...

And just before plotting I will extract the values of R^2 needed for the plot:

r2 <- df1 %>% group_by(key) %>% 
      do(mod = lm(Value ~ year, data = .)) %>% 
      mutate(r2sq = summary(mod)$r.squared) %>% 
      select(key, r2sq)
r2
Source: local data frame [3 x 2]
Groups: <by row>

     key       r2sq
  (fctr)      (dbl)
1   Tmax 0.03718175
2  Tmean 0.01216523
3   Tmin 0.02820540
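As a cross-check on the R^2 step, here is a minimal base R sketch that fits one linear model per series without dplyr. The data frame `toy` and the names `r2_base` and `r2_cor` are my own placeholders, not part of the answer above; `toy` simply mimics the layout of `df1`. The sketch also uses the fact that, for a regression with a single predictor, R^2 equals the squared Pearson correlation.

```r
# Hypothetical toy data with the same layout as df1 (year, key, Value)
set.seed(1)
toy <- data.frame(
  year  = rep(1990:2010, times = 3),
  key   = rep(c("Tmax", "Tmean", "Tmin"), each = 21),
  Value = c(rnorm(21, 9), rnorm(21, 3.5), rnorm(21, -2))
)

# Fit one linear model per series and extract its R^2
r2_base <- sapply(split(toy, toy$key), function(d) {
  summary(lm(Value ~ year, data = d))$r.squared
})

# With a single predictor, R^2 equals the squared correlation of x and y
r2_cor <- sapply(split(toy, toy$key), function(d) cor(d$year, d$Value)^2)

print(round(r2_base, 3))
```

If the two vectors agree, the per-group model fit is doing what you expect.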

Now to the plot:

pl <- ggplot(df1, aes(x = year, y = Value, col = key)) + geom_line() + 
    geom_smooth(method = lm)
pl + geom_text(data = r2, aes(x = 2005, y = c(11, 5, 1),
     label = paste0("R^2 : ", round(r2sq, 3))), parse = TRUE,
     col = "black", show.legend = FALSE)

The result is the following:
[Figure: the three series plotted as lines, each with a linear trend line and an R^2 label]
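Since the question asks for the R^2 values in the legend rather than on the plot, one possible variation is to put them into the legend labels. This is only a sketch under my own assumptions: `toy`, `legend_labels`, and the label format are placeholders, and the data is simulated in the same way as above.

```r
library(ggplot2)

# Hypothetical toy data with the same layout as df1 (year, key, Value)
set.seed(1)
toy <- data.frame(
  year  = rep(1990:2010, times = 3),
  key   = rep(c("Tmax", "Tmean", "Tmin"), each = 21),
  Value = c(rnorm(21, 9), rnorm(21, 3.5), rnorm(21, -2))
)

# R^2 of a linear fit for each series
r2 <- sapply(split(toy, toy$key), function(d) {
  summary(lm(Value ~ year, data = d))$r.squared
})

# One legend label per series, named so ggplot2 can match labels to series
legend_labels <- sprintf("%s (R2 = %.3f)", names(r2), r2)
names(legend_labels) <- names(r2)

p <- ggplot(toy, aes(x = year, y = Value, colour = key)) +
  geom_line() +
  geom_smooth(method = "lm", se = FALSE) +
  scale_colour_discrete(name = "Series", labels = legend_labels)
print(p)
```

Naming the label vector by series avoids relying on the order in which the series appear in the legend.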

Hope this helps.

Tal J. Levy

You could use stat_smooth with your meltd data frame:

ggplot(meltd, aes(x = Year, y = value, colour = variable)) +
    geom_line() +
    stat_smooth(method = lm)

EDIT:

Using geom_smooth(method = lm) will also work.
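For a self-contained version, here is a sketch that rebuilds `meltd` from the first seven rows shown in the question (1900 to 1906, rounded to three decimals here) using only base R, so the reshape package is not needed. The column names `variable` and `value` are chosen to match what `melt` produces; with the real file you would read the full table instead.

```r
library(ggplot2)

# First seven rows of the question's table, rounded to 3 decimals
frame <- data.frame(
  Year  = 1900:1906,
  TMAX  = c(11.200, 10.265, 10.031, 9.379, 8.660, 9.704, 9.874),
  TMEAN = c(4.685, 4.099, 4.025, 2.862, 2.659, 3.591, 3.795),
  TMIN  = c(-1.838, -2.075, -1.993, -3.651, -3.352, -2.534, -2.290)
)

# Long format in base R: one row per (Year, variable) pair
meltd <- data.frame(
  Year     = rep(frame$Year, times = 3),
  variable = rep(c("TMAX", "TMEAN", "TMIN"), each = nrow(frame)),
  value    = c(frame$TMAX, frame$TMEAN, frame$TMIN)
)

# Line per series plus a linear trend line per series
p <- ggplot(meltd, aes(x = Year, y = value, colour = variable)) +
  geom_line() +
  stat_smooth(method = "lm", se = FALSE)
print(p)
```

Setting `se = FALSE` hides the confidence band; drop it if you want the band shown.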

steveb