lme warning message because of random effects

Question

I have a data frame with 5 variables: Lot / Wafer / Serial Number / Voltage / Amplification. In this data frame there are 1020 subsets grouped by Serial_number. Each subset has a certain number of measurement data points (Amplification against voltage).

I fit the data with

summary(fit2.lme <- lmer(log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 1) | Serial_number),
+  data = APD))

which yields:

Linear mixed model fit by REML ['lmerMod']
Formula: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 1) |      Serial_number)
   Data: APD

REML criterion at convergence: 35286.1

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-20.7724  -0.2438  -0.1297   0.2434  13.2663 

Random effects:
 Groups        Name             Variance  Std.Dev. Corr 
 Serial_number (Intercept)      1.439e-02  0.1199       
               poly(Voltage, 1) 2.042e+03 45.1908  -0.76
 Residual                       8.701e-02  0.2950       
Number of obs: 76219, groups:  Serial_number, 1020

Fixed effects:
                    Estimate Std. Error t value
(Intercept)        5.944e-02  3.914e-03    15.2
poly(Voltage, 3)1  5.862e+02  1.449e+00   404.5
poly(Voltage, 3)2 -1.714e+02  3.086e-01  -555.4
poly(Voltage, 3)3  4.627e+01  3.067e-01   150.8

Correlation of Fixed Effects:
            (Intr) p(V,3)1 p(V,3)2
ply(Vlt,3)1 -0.713                
ply(Vlt,3)2  0.015 -0.004         
ply(Vlt,3)3  0.004  0.012   0.018

and when I add a higher polynomial in the random effects I get a warning:

>  summary(fit3.lme <- lmer(log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) | Serial_number),
+   data = APD))
Linear mixed model fit by REML ['lmerMod']
Formula: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) |      Serial_number)
   Data: APD

REML criterion at convergence: 16285.9

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-20.5042  -0.2393  -0.0697   0.3165  13.9634 

Random effects:
 Groups        Name              Variance  Std.Dev. Corr       
 Serial_number (Intercept)       1.584e-02  0.1259             
               poly(Voltage, 2)1 1.777e+03 42.1536  -0.67      
               poly(Voltage, 2)2 1.579e+03 39.7365   0.87 -0.95
 Residual                        6.679e-02  0.2584             
Number of obs: 76219, groups:  Serial_number, 1020

Fixed effects:
                    Estimate Std. Error t value
(Intercept)        5.858e-02  4.062e-03    14.4
poly(Voltage, 3)1  5.938e+02  1.351e+00   439.5
poly(Voltage, 3)2 -1.744e+02  1.276e+00  -136.7
poly(Voltage, 3)3  5.719e+01  2.842e-01   201.2

Correlation of Fixed Effects:
            (Intr) p(V,3)1 p(V,3)2
ply(Vlt,3)1 -0.641                
ply(Vlt,3)2  0.825 -0.906         
ply(Vlt,3)3 -0.001  0.030  -0.004 
convergence code: 1
Model failed to converge with max|grad| = 2.22294 (tol = 0.002, component 1)
Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?

Warning messages:
1: In optwrap(optimizer, devfun, getStart(start, rho$lower, rho$pp),  :
  convergence code 1 from bobyqa: bobyqa -- maximum number of function evaluations exceeded
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge with max|grad| = 2.22294 (tol = 0.002, component 1)
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model is nearly unidentifiable: large eigenvalue ratio
 - Rescale variables?

The data is as following (can provide the complete data of course, if desired). It includes 77479 observables of 6 variables:

'data.frame':   77479 obs. of  6 variables:
 $ Serial_number: num  9.12e+08 9.12e+08 9.12e+08 9.12e+08 9.12e+08 ...
 $ Lot          : int  9 9 9 9 9 9 9 9 9 9 ...
 $ Wafer        : int  912 912 912 912 912 912 912 912 912 912 ...
 $ Amplification: num  1 1 1.01 1.01 1.01 ...
 $ Voltage      : num  25 30 34.9 44.9 49.9 ...

and the data itself looks like:

Serial_number    Lot    Wafer    Amplification    Voltage
1    912009913    9    912    1.00252    24.9681
2    912009913    9    912    1.00452    29.9591
3    912009913    9    912    1.00537    34.9494
(...)
73    912009913    9    912    918.112    375.9850
74    912009913    9    912    1083.74    377.9990
75    912009897    9    912    1.00324    19.9895
76    912009897    9    912    1.00449    29.9777
(...)

What does the warnings mean? According to the anova the fit3.lme model describes the data better:

> anova(fit3.lme, fit2.lme)
refitting model(s) with ML (instead of REML)
Data: APD
Models:
fit2.lme: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 1) | 
fit2.lme:     Serial_number)
fit3.lme: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) | 
fit3.lme:     Serial_number)
         Df   AIC   BIC   logLik deviance Chisq Chi Df Pr(>Chisq)    
fit2.lme  8 35294 35368 -17638.9    35278                            
fit3.lme 11 16264 16366  -8121.1    16242 19036      3  < 2.2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Warning message:
In optwrap(optimizer, devfun, x@theta, lower = x@lower, calc.derivs = TRUE,  :
  convergence code 1 from bobyqa: bobyqa -- maximum number of function evaluations exceeded

Therefore I would like to use that model but I stuck in the warning.

update:

center and scale predictors

ss.CS<- transform(APD, Voltage=scale(Voltage)) 
> fit31.lme<- update(fit3.lme, data=ss.CS) 
Error in poly(dots[[i]], degree, raw = raw, simple = raw) : 
'degree' must be less than number of unique points

Also for the other variable (don't know for which it makes sense)

> ss.CS<- transform(APD, Amplitude=scale(Amplitude))
Error in scale(Amplitude) : object 'Amplitude' not found
> ss.CS<- transform(APD, Amplification=scale(Amplification))
> fit31.lme<- update(fit3.lme, data=ss.CS)
Warning messages:
1: In log(Amplification) : NaNs produced
2: In log(log(Amplification)) : NaNs produced
3: In log(Amplification) : NaNs produced
4: In log(log(Amplification)) : NaNs produced
5: In log(Amplification) : NaNs produced
6: In log(log(Amplification)) : NaNs produced
7: Some predictor variables are on very different scales: consider rescaling

check singularity

> diag.vals<- getME(fit3.lme, "theta")[getME(fit3.lme, "lower")==0]
> any(diag.vals<- 1e-6)
[1] TRUE
Warning message:
In any(diag.vals <- 1e-06) : coercing argument of type 'double' to logical

compute gradient and Hessian with Richardson extrapolation

> devfun<- update(fit3.lme, devFunOnly=TRUE)
>  if(isLMM(fit3.lme)){
+     pars<- getME(fit3.lme, "theta")
+     } else {
+             pars<- getME(fit3.lme, c("theta", "fixef"))
+            }
> if (require("numDeriv")) {
+    cat("hess:\n"); print(hess <- hessian(devfun, unlist(pars)))
+    cat("grad:\n"); print(grad <- grad(devfun, unlist(pars)))
+    cat("scaled gradient:\n")
+    print(scgrad <- solve(chol(hess), grad))
+  }
Loading required package: numDeriv
hess:
             [,1]          [,2]          [,3]         [,4]         [,5]         [,6]
[1,] 39137.840764 -56.189442277 -1.348127e+02  3.789427141 25.456612941 -3.806942811
[2,]   -56.189442   0.508077776  6.283795e-01 -0.068882737 -0.056159369  0.003228274
[3,]  -134.812704   0.628379462  1.061584e+00 -0.079620905 -0.152816413  0.007457255
[4,]     3.789427  -0.068882737 -7.962090e-02  0.516054976  0.534346634  0.001513168
[5,]    25.456613  -0.056159369 -1.528164e-01  0.534346634  0.901191745 -0.002344407
[6,]    -3.806943   0.003228274  7.457255e-03  0.001513168 -0.002344407  0.179283416
grad:
[1] -22.9114985   2.2229416  -0.2959238   0.6790044  -0.2343368  -0.4020556
scaled gradient:
[1] -0.1123624  4.4764140 -0.8777938  1.3980054 -0.4223921 -0.9508207
> fit3.lme@optinfo$derivs
$gradient
[1] -22.9118920   2.2229424  -0.2959264   0.6790037  -0.2343360  -0.4020605

$Hessian
             [,1]         [,2]          [,3]        [,4]        [,5]        [,6]
[1,] 39137.915527 -56.20745850 -134.87176514  3.74780273 25.47540283 -3.79016113
[2,]   -56.207458   0.44262695    0.61462402 -0.04736328 -0.06585693  0.02130127
[3,]  -134.871765   0.61462402    1.04296875 -0.10467529 -0.23223877  0.05438232
[4,]     3.747803  -0.04736328   -0.10467529  0.52026367  0.50909424 -0.02130127
[5,]    25.475403  -0.06585693   -0.23223877  0.50909424  0.68481445 -0.02044678
[6,]    -3.790161   0.02130127    0.05438232 -0.02130127 -0.02044678  0.07617188

4. restart the fit from the original value (or a slightly perturbed value):

> fit3.lme.restart <- update(fit3.lme, start=pars)
> summary(fit3.lme.restart)
Linear mixed model fit by REML ['lmerMod']
Formula: log(log(Amplification)) ~ poly(Voltage, 3) + (poly(Voltage, 2) |      Serial_number)
   Data: APD

REML criterion at convergence: 16250.3

Scaled residuals: 
     Min       1Q   Median       3Q      Max 
-20.4868  -0.2404  -0.0697   0.3166  13.9464 

Random effects:
 Groups        Name              Variance  Std.Dev. Corr       
 Serial_number (Intercept)       1.823e-02  0.1350             
               poly(Voltage, 2)1 2.124e+03 46.0903  -0.77      
               poly(Voltage, 2)2 1.937e+03 44.0164   0.90 -0.96
 Residual                        6.668e-02  0.2582             
Number of obs: 76219, groups:  Serial_number, 1020

Fixed effects:
                    Estimate Std. Error t value
(Intercept)          0.05823    0.00434    13.4
poly(Voltage, 3)1  593.83396    1.47201   403.4
poly(Voltage, 3)2 -174.61257    1.40711  -124.1
poly(Voltage, 3)3   57.15901    0.28427   201.1

Correlation of Fixed Effects:
            (Intr) p(V,3)1 p(V,3)2
ply(Vlt,3)1 -0.735                
ply(Vlt,3)2  0.868 -0.927         
ply(Vlt,3)3 -0.001  0.028  -0.003

5. try all available optimizers

> source(system.file("utils", "allFit.R", package="lme4"))
Loading required package: optimx
Loading required package: dfoptim
> fit3.lme.all <- allFit(fit3.lme)
bobyqa : [OK]
Nelder_Mead : [OK]
nlminbw : [OK]
nmkbw : [OK]
optimx.L-BFGS-B : [OK]
nloptwrap.NLOPT_LN_NELDERMEAD : [OK]
nloptwrap.NLOPT_LN_BOBYQA : [OK]
Warning messages:
1: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  unable to evaluate scaled gradient
2: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge: degenerate  Hessian with 1 negative eigenvalues
3: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  unable to evaluate scaled gradient
4: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge: degenerate  Hessian with 1 negative eigenvalues
5: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  unable to evaluate scaled gradient
6: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge: degenerate  Hessian with 1 negative eigenvalues
7: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  unable to evaluate scaled gradient
8: In checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv,  :
  Model failed to converge: degenerate  Hessian with 1 negative eigenvalues

 ss <- summary(fit3.lme.all)

> ss$ fixef               ## extract fixed effects
                              (Intercept) poly(Voltage, 3)1 poly(Voltage, 3)2 poly(Voltage, 3)3
bobyqa                         0.05822789          593.8340         -174.6126          57.15901
Nelder_Mead                    0.05822787          593.8340         -174.6126          57.15902
nlminbw                        0.05822787          593.8340         -174.6126          57.15902
nmkbw                          0.05841966          593.7804         -174.4999          57.17107
optimx.L-BFGS-B                0.05822845          593.8336         -174.6116          57.16183
nloptwrap.NLOPT_LN_NELDERMEAD  0.05823870          593.8330         -174.6076          57.16039
nloptwrap.NLOPT_LN_BOBYQA      0.05823870          593.8330         -174.6076          57.16039

> ss$ llik                ## log-likelihoods
                       bobyqa                   Nelder_Mead                       nlminbw                         nmkbw               optimx.L-BFGS-B 
                    -8125.125                     -8125.125                     -8125.125                     -8129.827                     -8125.204 
nloptwrap.NLOPT_LN_NELDERMEAD     nloptwrap.NLOPT_LN_BOBYQA 
                    -8125.137    

> ss$ sdcor               ## SDs and correlations
                              Serial_number.(Intercept) Serial_number.poly(Voltage, 2)1.(Intercept) Serial_number.poly(Voltage, 2)2.(Intercept)
bobyqa                                        0.1350049                                    46.09013                                    44.01631
Nelder_Mead                                   0.1350064                                    46.09104                                    44.01705
nlminbw                                       0.1350065                                    46.09106                                    44.01707
nmkbw                                         0.1347214                                    46.11336                                    43.81219
optimx.L-BFGS-B                               0.1356576                                    46.32849                                    44.27171
nloptwrap.NLOPT_LN_NELDERMEAD                 0.1347638                                    45.97995                                    43.91054
nloptwrap.NLOPT_LN_BOBYQA                     0.1347638                                    45.97995                                    43.91054
                              Serial_number.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2
bobyqa                                             -0.7665898                                         0.9042387                      -0.9608662
Nelder_Mead                                        -0.7665981                                         0.9042424                      -0.9608680
nlminbw                                            -0.7665980                                         0.9042425                      -0.9608680
nmkbw                                              -0.7658163                                         0.9076551                      -0.9649999
optimx.L-BFGS-B                                    -0.7713801                                         0.9067725                      -0.9617129
nloptwrap.NLOPT_LN_NELDERMEAD                      -0.7645748                                         0.9034336                      -0.9606020
nloptwrap.NLOPT_LN_BOBYQA                          -0.7645748                                         0.9034336                      -0.9606020
                                  sigma
bobyqa                        0.2582156
Nelder_Mead                   0.2582156
nlminbw                       0.2582156
nmkbw                         0.2584714
optimx.L-BFGS-B               0.2582244
nloptwrap.NLOPT_LN_NELDERMEAD 0.2582207
nloptwrap.NLOPT_LN_BOBYQA     0.2582207

> ss$ theta               ## Cholesky factors
                              Serial_number.(Intercept) Serial_number.poly(Voltage, 2)1.(Intercept) Serial_number.poly(Voltage, 2)2.(Intercept)
bobyqa                                        0.5228377                                   -136.8323                                    154.1396
Nelder_Mead                                   0.5228438                                   -136.8364                                    154.1428
nlminbw                                       0.5228439                                   -136.8365                                    154.1429
nmkbw                                         0.5212237                                   -136.6278                                    153.8521
optimx.L-BFGS-B                               0.5253478                                   -138.3947                                    155.4631
nloptwrap.NLOPT_LN_NELDERMEAD                 0.5218936                                   -136.1436                                    153.6293
nloptwrap.NLOPT_LN_BOBYQA                     0.5218936                                   -136.1436                                    153.6293
                              Serial_number.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2.poly(Voltage, 2)1 Serial_number.poly(Voltage, 2)2
bobyqa                                               114.6181                                         -71.06063                    1.578418e+01
Nelder_Mead                                          114.6186                                         -71.06067                    1.578354e+01
nlminbw                                              114.6187                                         -71.06067                    1.578351e+01
nmkbw                                                114.7270                                         -71.14411                    3.440466e-42
optimx.L-BFGS-B                                      114.1731                                         -70.65227                    1.527854e+01
nloptwrap.NLOPT_LN_NELDERMEAD                        114.7688                                         -71.19817                    1.568481e+01
nloptwrap.NLOPT_LN_BOBYQA                            114.7688                                         -71.19817                    1.568481e+01

> ss$ which.OK            ## which fits worked
                       bobyqa                   Nelder_Mead                       nlminbw                         nmkbw               optimx.L-BFGS-B 
                         TRUE                          TRUE                          TRUE                          TRUE                          TRUE 
nloptwrap.NLOPT_LN_NELDERMEAD     nloptwrap.NLOPT_LN_BOBYQA 
                         TRUE                          TRUE

Due to users's coment I add the following:

> bam(log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs="re") + s(Voltage, Serial_number, bs="re"), data=APD, discrete = TRUE)

Family: gaussian 
Link function: identity 

Formula:
log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs = "re") + 
    s(Voltage, Serial_number, bs = "re")

Estimated degrees of freedom:
  9 993 987  total = 1990.18 

fREML score: -226.8182   


> summary(bam(log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs="re") + s(Voltage, Serial_number, bs="re"), data=APD, discrete = TRUE))

Family: gaussian 
Link function: identity 

Formula:
log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs = "re") + 
    s(Voltage, Serial_number, bs = "re")

Parametric coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  0.11500    0.01896   6.066 1.31e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Approximate significance of smooth terms:
                             edf Ref.df     F p-value    
s(Voltage)                 8.998      9 89229  <2e-16 ***
s(Serial_number)         993.441   1019 55241  <2e-16 ***
s(Voltage,Serial_number) 986.741   1019 36278  <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

R-sq.(adj) =  0.989   Deviance explained =   99%
fREML = -226.82  Scale est. = 0.051396  n = 76219

On https://uploadfiles.io/n7h9z you can download the r script and data.

Update

(some plots concerning the gam model):

Here are all measurement data points transformed double-logarithmically:

The physical behaviour of the device is at least exponentially and even almost double-exponentially (as I found in a book). By transforming them double-logarithmically the almost behave then linearly. A polynomial of degree described the data already well but a polynomial degree of three did it better, though. I guess this can also be seen on the plot why that.

Some additional plots (I'm not used to GAMs so I just add them):

You can download the data from the link: https://uploadfiles.io/n7h9z

Including a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) in your question will increase your chances of getting an answer. — Samuel, Oct 13 '17 at 11:11
see `?convergence` for things you can try. If you are not sworn to this parametric model, you could look at fitting mixed-effect gam models - see mgcv::gamm, ot gamm4 package. — user20650, Oct 13 '17 at 11:20
@jsb I uploaded the r script and data on https://ufile.io/n7h9z — Ben, Oct 13 '17 at 11:39
If you google the warning message, there is some good discussion over at CrossValidated as well. — Samuel, Oct 13 '17 at 11:50
@jsb google provides too much results, can you please specify? — Ben, Oct 13 '17 at 11:52
I added all the results adviced in ?convergence. This is quite a lot of information, can you please help me to figure out what I need? — Ben, Oct 13 '17 at 13:20
Re the scaling of Voltage - as you are using `poly` to transform the variable, I don't *think* it will make any difference. You cant center `Amplitude`, and then `log` transform as you notice (as log of a negative number of undefined). Also quick comment on your first model that fit: the variance is very high, so even though its converged might be worth a look. Sometimes complex models dont fit if they cant be supported by the data. — user20650, Oct 13 '17 at 13:36
re the poly error: again i dont know but can reproduce with `x = rnorm(100) ; poly(scale(x), 1)`, but explicitley specifying the degree seems to work `poly(scale(x), degree=1)` So ??? — user20650, Oct 13 '17 at 13:39
pps you could try simplifying the models by removing correlation in the re's first go try `(poly(Voltage, 2) || Serial_number)` I'm not sure how to easily remove the correlation between the poly slopes using this syntax — user20650, Oct 13 '17 at 13:54
@user At first thanks a lot! Can you please describe how I apply "poly(scale(x)), (..)"? — Ben, Oct 13 '17 at 13:57
I've not really helped - hopefully someone with mixed-effects experience will offer guidance. My comment on poly(scale(x)), was just to illustrate that i can get similar errors (for which i dont understand) as your section *center and scale predictors*, and by setting `degree=...` explicitely seems to remove the error (again i dont understand why) .Though i dont think rescaling will help due to your use of poly. — user20650, Oct 13 '17 at 14:02
Ah, ok. Thanks anyway! I also tried the bam you adviced but though I installed the package and just enter bam(Amplification~Voltage, data=APD) I receive the error that the function cannot be found.. but I can load the package and also data frames out of it.. will try to figure that out. — Ben, Oct 13 '17 at 14:04
You shouldn’t need to install `mgcv` - it comes with `R`, but it sounds like you have not used `library(mgcv)` (I removed my previous comment on using gams so not to confuse... but I'll repost: if not theoretical reason for your choice of polynomial fitting a gam may be preferred. you could try `library(mgcv) ; bam(log(log(Amplification)) ~ s(Voltage) + s(Serial_number, bs="re") + s(Voltage, Serial_number, bs="re"), data=APD, discrete = TRUE)`) — user20650, Oct 13 '17 at 14:13
@user20650 Indeed, I can run it now and added the results in the question. — Ben, Oct 13 '17 at 15:28
okay good. But please make sure you are happy with the models, and how they have been specified. I may be giving bad stats advice! — user20650, Oct 13 '17 at 16:42
@user20650 Yes, I first have to check how gams represent my situation. All I can say is that lmer are often used for my case. I then just checked which order of the polynomials in the random effects I need (via anovas) and the higher the better. But since the optimizer are having problems(?) I don't know if I can trust the anovas. To be honest at the moment I'm feeling very lost: I have absolutely no idea in which direction I should move on. What I do not understand according to the ?convergence is that I ran all optimizers and all fits seem to work... but still the warning appeared.. ..? — Ben, Oct 13 '17 at 17:41
It may be worth asking a thinned down question at https://stats.stackexchange.com/questions to get some stats advice (or maybe two questions 1) convergence 2) general modelling advice to your data ie gams?). My 2c, fwiw, is that fitting and selecting polynomials with increasing degree is rarely a good idea, and would be unlikely to generalise - gams reduce some of these issues. — user20650, Oct 13 '17 at 17:52
Ok, thanks for your support, will do this. Just to respond the last issue: I increased the order of polynomials and the anovas showed that they become better till a degree of three. Then I added random effects and again the anovas showed that raneffs are better up to a polynomial of third degree. Will figure out gams, had already a look at them bit didn't get really behind their idea up to now. Took me even some time to understand R, lmer and poly().. :) and now optimizers and I still feel line I do my very first steps. — Ben, Oct 13 '17 at 18:59
Can you show the `plot` output for the `bam` model? Also, residuals vs fitted values. I'm very skeptical of polynomials of an order higher than two in regression models if there is no scientific basis for such a polynomial relationship (or is there such a theory?). The additive model seems to fit excellently. Also, do you aim for inference or prediction? — Roland, Oct 18 '17 at 07:51
@Roland Hi Roland, at first thank you for your answer! I added the plots (hopefully they are meaningful since I'm not really into GAMs)? Just want to add something to the polynomials of third degree: According to the Anovas they describe the data better and also the plot (log(log(Amplification)) ~ Voltage) seems to show that because two curvatures are visible. — Ben, Oct 18 '17 at 08:26
The GAM plots look excellent. The only improvement I might consider is including measurement uncertainties as weights if you have them for each value of amplification. I would use the GAM for whatever you want to do. If you need a parametric model a third-degree polynomial should be fine, but you'd probably need a Bayesian mixed model if you want to include random slopes for the higher order polynomial terms. I haven't needed to use these myself yet. — Roland, Oct 18 '17 at 08:41
@Roland Can you tell how I add these uncertainties? To be honest, I thought all the models include them on their own. Didn't hear of bayesian mixed models before, will try them out. If I understand you right I can either use this GAM model with little modifications or use a bayesian mixed model? — Ben, Oct 18 '17 at 08:49
See the `weights` parameter. Observations with smaller uncertainties get more weight. Typically you'd weight with something like 1/sigma or 1/sigma^2. Can't help you much with Bayesian modelling. I'm only suggesting it because it can help with your convergence problems. However, you'd need of course suitable priors ... — Roland, Oct 18 '17 at 08:53
Note that your transformation has impact on the variances. You clearly see larger variances at low values in your plots. — Roland, Oct 18 '17 at 08:55
I just ran a few blmer models. For polynomials of three degree in the fixed effects I get no error but immediately when I add a polynomial in the random effects there are again the same convergence warnings. And also the anova states the one with the convergence warnings to be better. — Ben, Oct 18 '17 at 09:02
Ah, yes, of course.. well, thanks a lot so far! I will now go more deeper with some gam and bayesian models. As a decision: Can you finally please tell me how to choose the better model (probably gam or bayesian then)? — Ben, Oct 18 '17 at 09:23
@Roland Before I forget: You asked if I aim for inference or prediction. Probably it is connected to both: With the help of the parameters extracted from the model I want to apply an inverse regression. — Ben, Oct 18 '17 at 10:02
@Roland Cannot say yes or no. Finally we set a specific amplification, which is the same for all devices, and we need the corresponding voltage. Unfortunately we do not have measurement data points at this certain amplification therefore an inverse regression is needed. The number of devices is in total about 30,000. At the moment I focus at only about thousand devices because they will be installed in such doses. — Ben, Oct 18 '17 at 10:12
PS: The effort with these mixed models is that typical standard literature fits do not fit very well and when I apply regular polynomials the deviance of the fits is larger than we need to set the electronics precisely. — Ben, Oct 18 '17 at 10:15
Divide your data in a training and a validation dataset and validate your final model. — Roland, Oct 18 '17 at 10:15
On which basis do I validate? Apart from that: Can you post your help as an answer so I can distribute the bounties then (in case you care)? — Ben, Oct 18 '17 at 14:47
The range of values provided by `poly()` is sensitive to the number of observations. For example, `apply(poly(rep(1:100, 1), 3), 2, range)` vs `apply(poly(rep(1:100, 100), 3), 2, range)`. 100 times as many observations of the same values but range of polynomial values shrinks greatly. That might contribute to the rescaling variables issue. I say a bit more about it [here](https://github.com/tjmahr/polypoly/blob/master/README.md#rescaling-a-matrix) — TJ Mahr, Oct 19 '17 at 13:52
@TJMahr Hi, thanks a lot for the interesting issue and package. Worked it all through and ran all the stuff but I still have a question (please don't misunderstand me): How does that solve my certain problem respectively how do I apply polypoly to it? I mean, lmer uses the polynomials on its own or can/shall I somehow modify it? — Ben, Oct 19 '17 at 16:53
I just want to make you aware that with 76219 observations, the polynomial matrix could have a very small scale. You might need to manually compute the polynomials and rescale them, which might help with the "rescale variables?" message. — TJ Mahr, Oct 19 '17 at 18:01
What do I do with the manually calculated polynomials? I guess this I already have done with the help of your package(?). But finally I need a lmer model, resoecticely the fixed and random coefficients. Sorry, maybe I'm not able to figure it out, I'm not thaaat fit with statistics. — Ben, Oct 19 '17 at 19:44

score 1 · Accepted Answer · answered Oct 25 '17 at 16:03

The convergence warnings disappeared when I removed all data points <2. I stumbled over this by coincidence..

Probably this is somehow connected to the issue that for each subset within the range from 0 to about 50 all data points are almost exactly the same (and have values of about ~1).