
I was working through the multiple-models chapter of #r4ds and ran into an error message at the end:

Error: missing values and NaN's not allowed if 'na.rm' is FALSE
In addition: Warning message:
In ns(as.numeric(Month), 4) : NAs introduced by coercion

with

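library(splines)  # ns()
library(dplyr)    # %>% and mutate()
library(purrr)    # map()

# Fit a natural-spline model to one nested data frame
ADA_model <- function(df) {
  lm(ADA ~ ns(as.numeric(Month), 4), data = df)
}

ADA_mutiple_model <- ADA_mutiple_model %>%
  mutate(model = map(data, ADA_model))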

as the code I used that creates the error.

See mod3 below to see what the function looks like

enter image description here

Zheyuan Li
user12081
  • You can't use `lm` if there are `NA` in your data. Therefore, the error message is straightforward: add the option `na.rm=TRUE` in `lm`. I suggest you look at your data as well to understand what is wrong with your data. – jgadoury Aug 15 '16 at 19:23
  • @jgadoury I don't think `lm` has a `na.rm` argument. Could you mean the `na.action` argument? – aosmith Aug 15 '16 at 20:52
  • The argument is `na.omit=TRUE`, my mistake – jgadoury Aug 15 '16 at 21:30
  • Yeah what the hell am I talking about, it's `na.action=na.omit`. That's what happens when I try to sound smart without double-checking my stuff – jgadoury Aug 16 '16 at 01:17
  • That is the right argument, but the problem was on my function – user12081 Aug 16 '16 at 17:55

1 Answer


Your problem has nothing to do with `lm` itself. It arises inside `splines::ns`, when it generates the B-spline basis for natural cubic splines. Very likely your `Month` is a character variable, and `as.numeric` cannot coerce text labels to numbers: it returns `NA` and raises the "NAs introduced by coercion" warning.


I just checked your attached figure. The x-axis in the plots verifies what I guessed. You need to use 1:12 for Month, not "JAN", "FEB", etc.
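As a minimal sketch of the conversion, assuming `Month` holds three-letter labels such as "JAN" (the vector below is hypothetical example data, not the asker's actual column), you can match the labels against R's built-in `month.abb`:

```r
# Hypothetical month labels, standing in for the Month column in the question
Month <- c("JAN", "FEB", "MAR", "DEC")

# as.numeric() cannot parse text labels: every element becomes NA
# (the "NAs introduced by coercion" warning is suppressed here)
bad <- suppressWarnings(as.numeric(Month))

# Map each label to its position in 1:12 via the built-in month.abb vector;
# toupper() on both sides makes the match case-insensitive
month_num <- match(toupper(Month), toupper(month.abb))
```

With a numeric column like `month_num` in the data, `ns()` receives real numbers instead of `NA`. Any label that is not a valid month abbreviation still comes back as `NA`, so it is worth checking `anyNA(month_num)` before fitting.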

Zheyuan Li
  • Thanks, that was the problem. I am wondering why, though, because I was able to do this earlier in the script – user12081 Aug 16 '16 at 17:50
  • ```
    mod1 <- lm(median_ADA ~ ns(as.numeric(Month), 2), data = Summary_model)
    mod2 <- lm(median_ADA ~ ns(as.numeric(Month), 3), data = Summary_model)
    mod3 <- lm(median_ADA ~ ns(as.numeric(Month), 4), data = Summary_model)
    mod4 <- lm(median_ADA ~ ns(as.numeric(Month), 5), data = Summary_model)

    grid <- Summary_model %>%
      data_grid(Month = seq_range(as.numeric(Month), n = 50, expand = 0.1)) %>%
      gather_predictions(mod1, mod2, mod3, mod4, .pred = "median_ADA")

    ggplot(Summary_model, aes(Month, median_ADA)) +
      geom_point() +
      geom_line(data = grid, colour = "red") +
      facet_wrap(~ model)
    ```
    – user12081 Aug 16 '16 at 17:51