
My results from using `splines::ns` with a least-squares fit varied with no rhyme or reason that I could see, and I think I have traced the problem to the `ns` function itself.

I have reduced the problem to this:

require(splines)
N <- 0
set.seed(1)
for (i in 1:100) N <- N + identical(ns(1:10,3),ns(1:10,3))
N

My results average about 39, ranging roughly 34--44, but I expected 100 every time. Why should the results of `ns` be random? If I substitute `bs` for `ns` in both places, I get 100, as expected. The `set.seed(1)` is there to show that the randomness I see is not something R intended.
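For comparison, here is the same loop with `bs` substituted for `ns` in both places; this version comes back 100 for me:

require(splines)
N <- 0
set.seed(1)
for (i in 1:100) N <- N + identical(bs(1:10,3),bs(1:10,3))
N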

In a clean session, using RStudio and R version 2.14.2 (2012-02-29), I get 39, 44, 38, etc. Everyone else seems to be getting 100.

Further info:

Substituting `splines::ns` for `ns` gives the same results. A clean vanilla session gives the same results. My computer has 8 cores.

The differences, when they happen, are generally or always 2^-54:

Max <- 0
for (i in 1:1000) Max <- max( Max, abs(ns(1:10,3)-ns(1:10,3)) )
c(Max,2^-54)

with result [1] 5.551115e-17 5.551115e-17. This variability causes me big problems down the line, because my `optimize(...)$min` now sometimes varies even in its first digit, making results unrepeatable.
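To see why wobble that small matters, here is a toy stand-in (not my actual ns-based objective): a function that is theoretically flat at its minimum, where per-call jitter of about 1e-16 is enough to move the minimizing x that optimize reports, even though the minimum value itself stays essentially zero.

f <- function(x) {
  # theoretically flat (exactly 0) on [0, 1], quadratic outside;
  # the tiny runif jitter stands in for ns's ~2^-54 wobble
  pmax(0, abs(x - 0.5) - 0.5)^2 + runif(1, -1e-16, 1e-16)
}
set.seed(1)
replicate(5, optimize(f, c(-1, 2))$minimum)  # the reported x wanders across [0, 1]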

My `sessionInfo()` from a clean vanilla session:

I created what I understand is called a clean vanilla session using

> .Last <- function() system("R --vanilla") 
> q("no")

Quitting blows away the current session, and when I restart I get the clean vanilla one. Then, in answer to Ben Bolker's clarifying question, I ran this at the beginning of that vanilla session:

> sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] Revobase_6.1.0   RevoMods_6.1.0   RevoScaleR_3.1-0 lattice_0.20-0  
[5] rpart_3.1-51    

loaded via a namespace (and not attached):
[1] codetools_0.2-8   foreach_1.4.0     grid_2.14.2       iterators_1.0.6  
[5] pkgXMLBuilder_1.0 revoIpe_1.0       tools_2.14.2      XML_3.9-1.1      
> require(splines)
Loading required package: splines
> N <- 0
> set.seed(1)
> for (i in 1:100) N <- N + identical(ns(1:10,3),ns(1:10,3))
> N
[1] 32
Jerry Utah
  • Start a clean session. I get 100 every time. – Tyler Rinker Dec 14 '12 at 02:36
  • My version: `"R version 2.15.1 (2012-06-22)"` and `"2.15.1"` for splines. – Tyler Rinker Dec 14 '12 at 03:18
  • I downloaded R 2.14.2 for windows and still get 100. Start a clean vanilla session [(LINK)](http://stackoverflow.com/questions/12540138/how-can-i-make-my-r-session-vanilla). – Tyler Rinker Dec 14 '12 at 03:34
  • Sorry see the link above, I suspect you have a session file in your working directory that loads automatically. It'll probably be a .Rhistory or .Rdata file. – Tyler Rinker Dec 14 '12 at 03:40
  • Even using `.Last <- function() system("R --vanilla"); q("no")`, I still get only 44 hits instead of the expected 100. – Jerry Utah Dec 14 '12 at 03:56
  • Do you always get 39 or does it vary? If you take one such instance where there is a mismatch and you look at the two outputs, can you spot the difference or is it just a rounding error? – flodel Dec 14 '12 at 04:02
  • Can you post the results of: `sessionInfo()`; also try: `splines::ns` in the code. – Tyler Rinker Dec 14 '12 at 04:15
  • Try this command: `for (i in 1:100) { a <- ns(1:10, 3); b<- ns(1:10, 3); if (!identical(a,b)) break }` and examine the values of `a` and `b` when it breaks. – Matthew Lundberg Dec 14 '12 at 04:29
  • I get identical every time on R 2.15.2 here, for what it's worth. – Matthew Lundberg Dec 14 '12 at 04:29
  • Do you have another piece of hardware to try this on? – Matthew Lundberg Dec 14 '12 at 04:37
  • What is your machine in terms of cores/CPUs? Maybe some of the code is run on different CPUs (would point to `qr` as the culprit) leading to tiny diffs. Speculation though... – flodel Dec 14 '12 at 04:39
  • For Matthew's suggestion, I have these results: `for (i in 1:30) if (!identical(a[[i]],b[[i]])) print(i)` prints 8, 9, 16, 19, 26, 28, 29, and the differences `a[[i]]-b[[i]]` at i = 8, 9, 16, 26, 28, 29 are -5.551115e-17, -2.775558e-17, -5.551115e-17, 2.775558e-17, -5.551115e-17, -5.551115e-17 respectively. – Jerry Utah Dec 14 '12 at 04:42
  • Well, clicking on the nice "help" link next to the AddComment button AND clicking on "learn more..." would seem to be the only logical thing to do. – Carl Witthoft Dec 14 '12 at 13:46
  • FWIW those errors are `2^(c(-54,-55))` – Carl Witthoft Dec 14 '12 at 15:50
  • (1) when you run the 'clean session' and do `sessionInfo()`, do all the loaded/attached packages other than the base packages definitely go away? (2) It does seem you've done your homework here, and something weird is going on, but ... I wonder about the reliability of any numerical operations you're doing such that these kinds of errors are a problem ... ? – Ben Bolker Dec 15 '12 at 01:33
  • Thanks to @Carl for trying to make me smarter about where to click. I sincerely hope it works! Though the "help" link may be great for experienced users, I find it so chock-full of terms and styles and colors undefined for beginners like me that I find it quite poorly written. Currently, I grade it a C- at best. I could do a much better job for a beginning stackoverflow user like me, though it would take a lot of experimenting with alternate possible meanings of the huge collection of vague aspects before I knew the answers to the scores of questions that I would have to figure out first. – Jerry Utah Dec 15 '12 at 01:50
  • For the @Ben questions, I'll check out (1) shortly. I can answer (2) right now. The function I am optimizing has theoretically constant regions, but ns's instability makes them slightly nonconstant, and this forces optimize() to zig in one case and zag in the other. The minimum functional value (f(x)) is correct, but the optimizing x value can vary greatly. What the calculations down the line need is the x value, not the f(x) value, making calculations totally unreproducible. Am I clear enough? – Jerry Utah Dec 15 '12 at 02:01
  • My answer to question (1) from @Ben is in the last section of my edited original question. – Jerry Utah Dec 15 '12 at 02:20
  • (1) you still seem to have a lot of extra stuff compared to what I'm used to (RevoBase, RevoMods, etc etc) -- are you using Revolution's R version? Or did `--vanilla` not really take effect? (2) if the optimizing x value can vary so much because of trivial numerical differences, you certainly can't expect to get the same answer on a different OS, or when compiling with a different compiler -- maybe not even with a different chip! Again, it is (with some difficulty) possible to imagine situations where that's OK, but it strikes me as very scary indeed. – Ben Bolker Dec 15 '12 at 03:23
  • For @Ben 's latest questions: (1) Yes, you're right, I am indeed trying Revolution R's version of R this month. It has several advantages. I believe that `--vanilla` did take effect, though I can let someone with more knowledge decide for certain. For (2), don't worry too much for me. I am only prototyping now; our final product will involve no optimizations of this kind at all. I am trying to learn now a rule of thumb to allow the final program to work adequately well---it must be very fast for our clients. Therefore, I only need my current optimizations to be stable on my computer. – Jerry Utah Dec 15 '12 at 05:28
  • I have a workaround that apparently generally works: I add 1e-12 to my objective function each iteration, removing almost all possibility of theoretical ties and numerical near-ties. A side benefit is that it makes `optimize` converge in fewer steps. This workaround is not ideal, obviously. – Jerry Utah Dec 15 '12 at 05:47
  • Thanks for continuing to dig, it could be helpful to someone else (remember that you're allowed and indeed encouraged to post an answer to your own question). It is a little spooky if Revolution R's linear algebra libraries are giving apparently non-deterministic results ... I look forward to the full explanation. – Ben Bolker Dec 17 '12 at 01:28
  • Yes, you can post an answer to your own question (there are some restrictions on timing -- you can't answer right away -- and there might be some by reputation? But give it a shot). I don't think you can upvote your own answer, but you can accept it ... http://meta.stackexchange.com/questions/17463/can-i-answer-my-own-questions-even-those-where-i-knew-the-answer-before-asking , http://meta.stackexchange.com/questions/12513/should-i-not-answer-my-own-questions – Ben Bolker Dec 17 '12 at 13:46

1 Answer


This is the answer I got from Revolution Technical Support (posted here with permission):

The problem here is an issue of floating-point arithmetic. Revolution R uses the Intel MKL BLAS library for some computations, which differs from the BLAS that CRAN R uses, and the `ns()` computation goes through this library. In this case you will also get different results depending on whether you are doing the computation on an Intel-processor-based machine or a machine with an AMD chipset.

We do ship the same BLAS and LAPACK DLLs that are shipped with CRAN R, but they are not the default ones used with Revolution R. Customers who prefer the CRAN DLLs can revert to them by doing the following:

1) Rename 'Rblas.dll' to 'Rblas.dll.bak' and 'Rlapack.dll' to 'Rlapack.dll.bak' in the folder 'C:\Revolution\R-Enterprise-6.1\R-2.14.2\bin\x64'.

2) Rename the files 'Rblas.dll.0' and 'Rlapack.dll.0' in this folder to Rblas.dll and Rlapack.dll respectively.

Their suggestion worked perfectly. I have renamed these files back and forth several times, using both RStudio (with Revolution R) and Revolution R's own IDE, always with the same result: the MKL DLLs give me N == 40 or so, and the CRAN-R DLLs give me N == 100.
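For anyone who wants to script the swap rather than rename the files in Explorer, something along these lines should do it. The path is the one quoted in the support note above (adjust for your install), and note the DLLs may be locked while Revolution R itself is running, so doing the renames with R closed, as support suggests, is the safer route:

bin64 <- "C:/Revolution/R-Enterprise-6.1/R-2.14.2/bin/x64"   # path from the support note; adjust for your install
owd <- setwd(bin64)
file.rename(c("Rblas.dll", "Rlapack.dll"),
            c("Rblas.dll.bak", "Rlapack.dll.bak"))   # step 1: set the MKL DLLs aside
file.rename(c("Rblas.dll.0", "Rlapack.dll.0"),
            c("Rblas.dll", "Rlapack.dll"))           # step 2: promote the CRAN-R DLLs
setwd(owd)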

I will probably go back to the MKL BLAS, because in my tests it is 8 times faster for `%*%` and 4 times faster for `svd()`. And that is using just one of my cores (verified by the CPU usage column of the Processes tab of Windows Task Manager).
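For reference, that kind of timing comparison is nothing fancy, roughly along these lines (the matrix size here is arbitrary, not necessarily what I used):

set.seed(1)
A <- matrix(rnorm(2000 * 2000), 2000)
system.time(A %*% A)   # dense matrix multiply: BLAS-bound
system.time(svd(A))    # singular value decomposition: LAPACK-bound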

I am hoping someone with better understanding can write a better answer, for I still don't really understand the full ramifications of this.

Jerry Utah
  • Yes, but ... how can you get different answers on different repeated runs *when running on the same machine*?? @flodel suggested above that maybe different runs were getting run on different CPUs etc. within the same machine, but hard to imagine the *chipsets* varying within runs on a single machine ... – Ben Bolker Dec 18 '12 at 20:45
  • I agree. While Revolution R's response may be technically correct, if the Intel libraries produce randomly varying results, that would be **BAD**. If there are other readers out there who have R-Enterprise installed, I'd love to see them run this same test (w/ the different libraries). – Carl Witthoft Dec 18 '12 at 21:03
  • I realise the thread is a year old by now, but wanted to add: the same (or worse) behavior is seen in the new Revolution R 7.0 (R 3.0.2), with MKL. I get values which differ substantially. In the problem here, if I modify the code to use `all.equal()` rather than `identical()`, the MKL library does OK. My question is here: [link](http://stackoverflow.com/questions/20664377/lme-different-results-each-run-under-revolution-r) – Peter Dec 19 '13 at 13:46