5

Related: R: Marking slope changes in LOESS curve using ggplot2
That question looks for the min/max y (where slope = 0); I'd like to find the min/max slope itself.

For background, I'm comparing various modelling techniques and thought I might use the slope of the fitted curve to gauge the best models produced by random seeds when iterating through neural network results.

Get the data:

nn <- read.csv("http://pastebin.com/raw.php?i=6SSCb3QR", header=T)
rbf <- read.csv("http://pastebin.com/raw.php?i=hfmY1g46", header=T)

As an example, here are the results of a trained neural network on my data:

library(ggplot2)
ggplot(nn, aes(x=x, y=y, colour=factor(group))) + 
geom_point() + stat_smooth(method="loess", se=F)

[plot: nn results with loess fits]

Similarly, here's one rbf model:

ggplot(rbf, aes(x=x, y=y, colour=factor(group))) + 
geom_point() + stat_smooth(method="loess", se=F)

[plot: rbf results with loess fits]

The RBF model fits the data better, and agrees better with background knowledge of the variables. I thought of trying to calculate the min/max slope of the fitted line in order to prune out NNs with steep cliffs vs. more gentle curves. Identifying crossing lines would be another way to prune, but that's a different question.

Thanks for any suggestions.


Note: I used ggplot2 here and tagged the question accordingly, but that doesn't mean it couldn't be accomplished with some other function. I just wanted to visually illustrate why I'm trying to do this. I suppose a loop could do this with (y1-y0)/(x1-x0), but perhaps there's a better way?
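The (y1-y0)/(x1-x0) loop idea can be written without an explicit loop by predicting the loess fit on a regular grid and taking finite differences. A minimal sketch, using synthetic data since it only illustrates the approach:

```r
# Sketch of the (y1 - y0)/(x1 - x0) idea: fit a loess curve,
# predict on a fine regular grid, and take finite differences.
set.seed(42)
d <- data.frame(x = 1:50, y = cumsum(rnorm(50)))
fit <- loess(y ~ x, data = d, degree = 2)

grid <- seq(min(d$x), max(d$x), length.out = 200)
pred <- predict(fit, newdata = data.frame(x = grid))
slope <- diff(pred) / diff(grid)   # approximate first derivative
c(min = min(slope), max = max(slope))
```

Because the grid is evenly spaced, `diff(grid)` is constant, but dividing by it keeps the numbers on a true slope scale so models are comparable.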

Hendy
  • For a similar case, see: http://stackoverflow.com/questions/11744012/finding-the-maximum-gradient-of-a-growth-curve/11745538#11745538 – Dieter Menne Aug 29 '12 at 18:16
  • Would `numericDeriv(my_loess$y)` suffice? – Carl Witthoft Aug 29 '12 at 18:53
  • @CarlWitthoft: I'm not familiar enough with R to know what `my_loess` would be. – Hendy Aug 29 '12 at 20:23
  • @Hendy Sorry: it's a shorthand for the object `loess` returns, i.e. `my_loess <- loess(rbf)` – Carl Witthoft Aug 30 '12 at 11:05
  • @CarlWitthoft Do I need to pass more options? I did what you suggested with `my_loess <- loess(x~y, data=rbf)` and then `numericDeriv(my_loess$y)` and I get an error: `Error in length(theta) : 'theta' is missing`. Sorry for making you walk me through everything! – Hendy Aug 30 '12 at 22:27
  • @Hendy -- I sincerely apologize. I'd never used `numericDeriv` and only skimmed the help page. I tried to use it on simple data and have absolutely no idea what it's doing or what it's intended for. I found success with `sfsmisc::D1D2` which is designed specifically to calculate the derivative of `y_vector` vs `x_vector`. In this case, `D1D2(1:length(my_loess$y),my_loess$y)` works. – Carl Witthoft Aug 31 '12 at 11:52

2 Answers

4

I think the simplest solution would be to use the first difference (using function diff) as an approximation of the first derivative.

slope.loess <- function(X, data){
    # First fit the loess model for this group:
    my_loess <- loess(y ~ x, data = data, subset = data$group == X, degree = 2)
    # Then take the first difference of the fitted values
    first_diff <- diff(my_loess$fitted)
    # Then pull out the x and y values at the minimum and maximum difference
    idx <- c(which.min(first_diff), which.max(first_diff))
    res <- cbind(my_loess$x[idx], my_loess$fitted[idx])
    colnames(res) <- c("x", "y")
    rownames(res) <- c("min", "max")
    res
}

#Then apply the function to each group
slope.rbf <- lapply(levels(rbf$group), FUN=slope.loess, data=rbf)
names(slope.rbf) <- levels(rbf$group)

slope.rbf
$A
           x        y
min 3.310345 20.30981
max 7.724138 18.47787

$B
           x        y
min 3.310345 21.75368
max 7.724138 20.06883

$C
           x        y
min 3.310345 23.53051
max 7.724138 21.47636

$D
           x        y
min 4.413793 25.02747
max 0.000000 26.22230

$E
           x        y
min 4.413793 27.45100
max 0.000000 27.39809
plannapus
2

I am writing a neural network myself for ultrafast trading. In the beginning I was using loess or lowess to fit time series, but what I wanted was smooth derivatives, which loess doesn't supply. Even if you implement loess yourself and differentiate the local polynomial at each point, you get odd results. There is a reason for this: the local polynomial changes from point to point, so its derivative is not the derivative of the smoothed curve.

A solution to your problem can be found in a paper by Graciela Boente: "Robust estimators of high order derivatives of regression functions". The formula is on page 3, and the paper is freely available on the internet. Once you have both values and derivatives, you can use them to uniquely define cubic splines, which will give continuous derivatives.

I am not familiar with R, so I cannot provide R code.
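The last step described above (building a spline from values plus derivatives) is just cubic Hermite interpolation. A minimal generic R sketch of that step, assuming sorted knots `x` with values `y` and derivatives `dy` (this is my own illustration, not code from the Boente paper):

```r
# Cubic Hermite interpolation from values y and derivatives dy at
# sorted knots x. Generic sketch, not code from the Boente paper.
hermite_eval <- function(x, y, dy, x_new) {
  i <- findInterval(x_new, x, all.inside = TRUE)  # interval index per point
  h <- x[i + 1] - x[i]
  t <- (x_new - x[i]) / h                         # local coordinate in [0, 1]
  # Standard Hermite basis polynomials
  h00 <- 2 * t^3 - 3 * t^2 + 1
  h10 <- t^3 - 2 * t^2 + t
  h01 <- -2 * t^3 + 3 * t^2
  h11 <- t^3 - t^2
  h00 * y[i] + h10 * h * dy[i] + h01 * y[i + 1] + h11 * h * dy[i + 1]
}

# Example: interpolate sin(x) using its exact derivatives cos(x)
xs  <- seq(0, pi, length.out = 6)
ys  <- sin(xs)
dys <- cos(xs)
hermite_eval(xs, ys, dys, pi / 2)   # close to sin(pi/2) = 1
```

Because the derivatives are prescribed at the knots, the resulting piecewise cubic has a continuous first derivative by construction.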

Jens Munk