26

I have a data set with some points in it and want to fit a line to it. I tried the loess function, but the result looks very strange (see the plot below). I expected a line that follows the points more closely and extends across the whole plot. How can I achieve that?

How to reproduce it:

Download the dataset from https://www.dropbox.com/s/ud32tbptyvjsnp4/data.R?dl=1 (only two kb) and use this code:

load(url('https://www.dropbox.com/s/ud32tbptyvjsnp4/data.R?dl=1'))
lw1 = loess(y ~ x,data=data)
plot(y ~ x, data=data,pch=19,cex=0.1)
lines(data$y,lw1$fitted,col="blue",lwd=3)

Any help is greatly appreciated. Thanks!

leo
  • I tried to download the data file. It did download, but I could not read it. What format is it in? Could you upload an ASCII DOS text file? Maybe I am doing something wrong. Maybe I need to have DropBox installed on my machine to read the downloaded file? Thanks. – Mark Miller Jul 07 '15 at 08:04

3 Answers

67

You've plotted fitted values against y instead of against x. Also, you will need to order the x values before plotting a line. Try this:

lw1 <- loess(y ~ x,data=data)
plot(y ~ x, data=data,pch=19,cex=0.1)
j <- order(data$x)
lines(data$x[j],lw1$fitted[j],col="red",lwd=3)
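
If the Dropbox file is unavailable, the effect of the ordering step can be tried on simulated data. This is only a sketch: the `sim` data frame, its sine shape, and its noise level are assumptions and not the original data set. It draws the fitted values once in row order and once ordered by x:

```r
# Simulated stand-in for the original data (an assumption, not the real data set)
set.seed(1)
sim <- data.frame(x = runif(200, 0, 10))
sim$y <- sin(sim$x) + rnorm(200, sd = 0.3)

fit <- loess(y ~ x, data = sim)

plot(y ~ x, data = sim, pch = 19, cex = 0.3)

# Row order: lines() joins the points in the order given, which produces a zigzag
lines(sim$x, fit$fitted, col = "grey70")

# Ordered by x: the same fitted values now trace a single curve
j <- order(sim$x)
lines(sim$x[j], fit$fitted[j], col = "red", lwd = 3)
```

The grey zigzag and the red curve pass through exactly the same points; only the order in which `lines()` connects them differs.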


Carl
Rob Hyndman
4

Unfortunately the data are not available anymore, but an easier way to fit a non-parametric line (locally weighted scatterplot smoothing, or simply LOESS) is to use the following code:

scatter.smooth(y ~ x, span = 2/3, degree = 2)

Note that you can adjust the span and degree parameters to control how smooth the line is.
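
To get a feel for `span`, here is a small sketch on simulated data (the `sim` data frame is an assumption, standing in for the data set that was unavailable). Because `x` and `y` are columns of a data frame here, the calls are wrapped in `with()` so that `scatter.smooth()` can see them. The `enp` component of a `loess` fit (the equivalent number of parameters) gives a rough numeric measure of how flexible each fit is:

```r
# Simulated stand-in for the original data (an assumption, not the real data set)
set.seed(1)
sim <- data.frame(x = runif(200, 0, 10))
sim$y <- sin(sim$x) + rnorm(200, sd = 0.3)

# Two panels: a small span follows the points closely, a large span is smoother
op <- par(mfrow = c(1, 2))
with(sim, scatter.smooth(x, y, span = 0.2, degree = 2, main = "span = 0.2"))
with(sim, scatter.smooth(x, y, span = 1, degree = 2, main = "span = 1"))
par(op)

# The same comparison with loess() directly
wiggly <- loess(y ~ x, data = sim, span = 0.2, degree = 2)
smooth <- loess(y ~ x, data = sim, span = 1, degree = 2)
c(wiggly = wiggly$enp, smooth = smooth$enp)
```

Note that `degree` is the degree of the local polynomial and, for `loess`, is limited to 0, 1 or 2, so most of the tuning happens through `span`.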

HonzaB
  • The data are back, thanks for the hint. Probably you can change your answer accordingly. – leo May 18 '17 at 07:23
4

Maybe it is too late, but you also have options with ggplot2 (and dplyr). First, if you only want to plot a loess line over the points, you can try:

library(ggplot2)
load(url("https://www.dropbox.com/s/ud32tbptyvjsnp4/data.R?dl=1"))
ggplot(data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "loess", se = FALSE)

Figure: loess line drawn with `ggplot2::geom_smooth()`

Another way is to call predict() on a loess fit. For instance, I used dplyr functions to add the predictions as a new column called "loess":

library(dplyr)
data %>%
  mutate(loess = predict(loess(y ~ x, data = data))) %>%
  ggplot(aes(x, y)) +
  geom_point(color = "grey50") +
  geom_line(aes(y = loess))

Figure: loess line drawn with `predict()` and `geom_line()`

Update: added a line of code to load the example data provided. Update 2: corrected the geom_smooth() function name, following @Phil's comment.

gavg712
  • Commenting on an old post, but I thought I'd point out that there is a typo in the code. "geom_smoth" should be "geom_smooth". Since typos in code make the code unable to run, it is worth pointing out. – Phil Nov 13 '18 at 14:12
  • Why don't we get the same curves from `ggplot` and `loess`? – Julien Nov 04 '22 at 09:12
  • `geom_smooth()` predicts over an internally generated sequence spanning the entire range of `x`, while `loess` in the example predicts only at the `x` values that exist in the dataset. – gavg712 Nov 05 '22 at 15:06
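
The point made in the last comment can be sketched in base R, without ggplot2. The snippet below fits one loess model to simulated data (an assumption, not the original data set) and evaluates it both at the observed `x` values and on an evenly spaced grid. The grid size of 80 is meant to mimic what I believe is the default `n` of `geom_smooth()`; treat that number as an assumption and check the ggplot2 documentation for your version.

```r
# Simulated stand-in for the original data (an assumption, not the real data set)
set.seed(1)
sim <- data.frame(x = runif(200, 0, 10))
sim$y <- sin(sim$x) + rnorm(200, sd = 0.3)

fit <- loess(y ~ x, data = sim)

# Evaluate the fit at the observed x values (one value per row of sim)
at_data <- predict(fit)

# Evaluate the same fit on an evenly spaced grid over the range of x
grid <- data.frame(x = seq(min(sim$x), max(sim$x), length.out = 80))
at_grid <- predict(fit, newdata = grid)

plot(y ~ x, data = sim, pch = 19, cex = 0.3, col = "grey50")
j <- order(sim$x)
lines(sim$x[j], at_data[j], col = "red", lwd = 3)   # at the data points
lines(grid$x, at_grid, col = "blue", lty = 2)       # on the grid
```

Both lines come from the same fitted model, so any visible difference is due only to where the curve is evaluated and how the points are joined.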