
I would like to fit the data and predict y values for a wider x range.

Let's assume I have the 'iris' data set and use the following code for prediction, taken from this post:

 library(dplyr)
 library(minpack.lm)  # provides nlsLM() and nls.lm.control()
 cc <- iris %>%
  group_by(Species) %>%
  do({
    mod <- nlsLM(Sepal.Length ~ k*Sepal.Width/2+U, start=c(k=10,U=5), data = ., trace=F, control = nls.lm.control(maxiter=100))
    pred <- predict(mod, newdata =.["Sepal.Width"])
    data.frame(., pred)
  })

This is the plot of the fit (image not included).

I want to predict over a wider Sepal.Width range, such as

new.range<- data.frame(x=seq(2,10,length.out=20))

and modify the script accordingly:

 pred <- predict(mod, newdata =new.range)

To plot the fit over new.range:

library(ggplot2)

ggplot(cc,aes(y=Sepal.Length,x=Sepal.Width ,col=factor(Species)))+
  geom_point()+
  facet_wrap(~Species)+
  geom_line(aes(x=new.range,y=pred),size=1)

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE, : arguments imply differing number of rows: 20, 150

I cannot understand why I am getting this error. I assumed that pred is calculated from new.range, so shouldn't they have the same length?

Similar posts:

using-predict-in-nls

trouble-with-predict-function-in-r

predict-maybe-im-not-understanding-it?

Glorfindel
Alexander

1 Answer


This achieves what you want. The cause of your original problem is that the predictor in your regression is named Sepal.Width, not x, so your prediction doesn't use new.range at all. You have to define something like new.range <- data.frame(Sepal.Width=seq(2,10,length.out=50)) to make predictions over the new range.
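A minimal sketch of that behaviour, assuming base R's nls() as a stand-in for nlsLM() (so it runs without minpack.lm) and a single species; the names d, mod, bad and good are only for illustration. When newdata has no column called Sepal.Width, predict() falls back to the data the model was fitted on, so the result has one value per original row rather than one per row of newdata:

```r
# Fit the same model on one species, using stats::nls() instead of nlsLM()
# (an assumption made here so that the snippet needs no extra packages).
d <- subset(iris, Species == "setosa")
mod <- nls(Sepal.Length ~ k * Sepal.Width / 2 + U,
           data = d, start = c(k = 10, U = 5))

# Column is called x: the model never sees it.
bad <- data.frame(x = seq(2, 10, length.out = 20))
# Column matches the predictor name used in the formula.
good <- data.frame(Sepal.Width = seq(2, 10, length.out = 20))

length(predict(mod, newdata = bad))   # number of rows in d, not in bad
length(predict(mod, newdata = good))  # number of rows in good
```

Comparing the two lengths is a quick way to check whether a newdata argument is actually being used.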

Another problem is that new.range must have 50 rows, the same as each Species group, so that pred and new.range fit alongside the original rows in data.frame(., new.range, pred).

Then you can draw the plot you want. Note that the new.range column appears as Sepal.Width.1, because data.frame() renames the duplicated column name.
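The renaming can be seen in a small sketch (the objects a, b and combined are made up for illustration): data.frame() makes column names unique by default, so the second Sepal.Width gets a .1 suffix.

```r
# Two data frames that both contain a column named Sepal.Width.
a <- head(iris[, c("Sepal.Length", "Sepal.Width")], 3)
b <- data.frame(Sepal.Width = c(2, 6, 10))

# Combining them de-duplicates the column names.
combined <- data.frame(a, b)
names(combined)
# "Sepal.Length" "Sepal.Width" "Sepal.Width.1"
```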

library(dplyr)
library(minpack.lm)  # provides nlsLM() and nls.lm.control()
cc <- iris %>%
    group_by(Species) %>%
    do({
        mod <- nlsLM(Sepal.Length ~ k*Sepal.Width/2+U, start=c(k=10,U=5), data = ., trace=F, control = nls.lm.control(maxiter=100))
        new.range<- data.frame(Sepal.Width=seq(2,10,length.out=50))
        pred <- predict(mod, newdata =new.range)
        # pred <- predict(mod, newdata =.["Sepal.Width"])
        data.frame(., new.range, pred)

    })

library(ggplot2)

ggplot(cc,aes(y=Sepal.Length,x=Sepal.Width ,col=factor(Species)))+
    geom_point()+
    facet_wrap(~Species)+
    geom_line(aes(x=Sepal.Width.1,y=pred),size=1)
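
If the groups do not all have the same number of rows, one alternative (a sketch, not the approach used above) is to keep the predictions in their own data frame and hand that to geom_line() through its data argument, so the grid length no longer has to match the group size. The snippet below assumes base R's nls() in place of nlsLM() and uses made-up names grid, fit_one and pred_df; the plotting part only runs if ggplot2 is installed.

```r
# Prediction grid; its length is independent of the group sizes.
grid <- data.frame(Sepal.Width = seq(2, 10, length.out = 20))

# Fit one group and return predictions over the grid only.
# stats::nls() is used here as a stand-in for minpack.lm::nlsLM().
fit_one <- function(d) {
  mod <- nls(Sepal.Length ~ k * Sepal.Width / 2 + U,
             data = d, start = c(k = 10, U = 5))
  data.frame(Species = d$Species[1], grid,
             pred = predict(mod, newdata = grid))
}

# One block of grid-length rows per species, stacked together.
pred_df <- do.call(rbind, lapply(split(iris, iris$Species), fit_one))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Points come from iris, lines from pred_df; print(p) draws the plot.
  p <- ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, col = Species)) +
    geom_point() +
    facet_wrap(~Species) +
    geom_line(data = pred_df, aes(y = pred))
}
```

Because the observed points and the fitted lines live in separate data frames, neither has to be padded or truncated to match the other.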
Consistency
  • Thanks for the great answer. OTOH, in my real data the size of each group is different! In the `iris` data set each species has 50 rows, so setting length.out=50 is fine. However, in my real data some groups have 100 or 160 rows, so I cannot use the same `length.out` value for all of them. Do you have any suggestions for that? – Alexander Jun 30 '17 at 02:59
  • @Alexander You are welcome. If the row size is different, you can replace 50 by `nrow(.)`, which should work. – Consistency Jun 30 '17 at 03:11