The .Internal(La_rs(x,FALSE)) call inside the eigen function, as used within fields:::Krig.engine.default, crashes my R console[1] when it is triggered during a huge script.

I am sure that's the line causing the crash. However, the same line does not crash in a fresh session using:

x <- structure(c(0.00251355321405019, -0.000589785531216647, -0.000172411748626129, -0.000589785531217227, 0.000897505637785858, -0.000714600035538855, -0.000172411748626269, -0.000714600035538766, 0.00123946691634644), .Dim = c(3L, 3L))
.Internal(La_rs(x,FALSE))

you can reproduce this console crash with the following three lines (takes about ten minutes):

# install.packages( c("MonetDB.R", "MonetDBLite" , "survey" , "SAScii" , "descr" , "downloader" , "digest" , "sas7bdat" , "R.utils" ,"survey","ggplot2","scales","mapproj","sqldf","maptools","raster","rgeos","stringr","plyr","mgcv","spatstat","rgeos") , repos=c("http://dev.monetdb.org/Assets/R/", "http://cran.rstudio.com/"))
# path.to.7z <- "7za"       # macintosh/unix users need to specify 7z
#  setwd("C:/My Directory/")
# warning: some large downloads
downloader::source_url( "https://raw.githubusercontent.com/davidbrae/swmap/8eecde1683efab65a7e27eb7c92e7967a98dc639/how%20to%20map%20the%20american%20community%20survey.R" , prompt = FALSE )

sorry the example isn't more minimal; the crash disappeared whenever i removed different parts of the script..

february 22nd 2016 edit: even worse, when i try a script intended to trigger the crash on its own, it does not die!

downloader::source_url( "https://gist.githubusercontent.com/ajdamico/0c256ed3a77d77eecfd6/raw/ce0570effd37c6384f2e27f1b38335078adcb49d/La_rs_bughunt.R" , echo = TRUE , prompt = FALSE )

thanks!

[1] R version 3.2.3 (2015-12-10) Platform: x86_64-w64-mingw32/x64 (64-bit)

if i run the whole script at once, R crashes without any info in Rterm.exe. but if i break the script up into two parts, R gives me this error:

> x
              [,1]          [,2]          [,3]
[1,]  0.0025135532 -0.0005897855 -0.0001724117
[2,] -0.0005897855  0.0008975056 -0.0007146000
[3,] -0.0001724117 -0.0007146000  0.0012394669
> .Internal(La_rs(x,TRUE))
Error: 'a' must be a complex matrix

a bit more debugging info: it looks like the .Internal() function La_rs has been destroyed somehow?

> debug::mtrace(.Internal(La_rs(x,TRUE)))
Error in debug::mtrace(.Internal(La_rs(x, TRUE))) : 
  Dunno wot to do with .Internal(La_rs(x, TRUE))
> x
              [,1]          [,2]          [,3]
[1,]  0.0025135532 -0.0005897855 -0.0001724117
[2,] -0.0005897855  0.0008975056 -0.0007146000
[3,] -0.0001724117 -0.0007146000  0.0012394669
> class(x)
[1] "matrix"
> .Internal(La_rs(x,FALSE))
Error: 'a' must be a complex matrix
> .Internal(La_rs(x,TRUE))
Error: 'a' must be a complex matrix
> .Internal(La_rs(1,TRUE))
Error: 'a' must be a complex matrix
> .Internal(La_rs(matrix(1,2,3,4),TRUE))
Error: 'a' must be a complex matrix
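
Since the error message talks about a complex matrix, one thing worth ruling out through the public API is whether the object reaching eigen really is a real-valued, finite, square matrix. The following is a minimal sketch, not a diagnosis: the helper name describe_matrix is hypothetical (it is not part of base R or fields), and it only records the properties of x before calling the documented eigen() entry point instead of .Internal(La_rs(...)).

```r
# the 3x3 matrix from the question
x <- structure(c(0.00251355321405019, -0.000589785531216647, -0.000172411748626129,
                 -0.000589785531217227, 0.000897505637785858, -0.000714600035538855,
                 -0.000172411748626269, -0.000714600035538766, 0.00123946691634644),
               .Dim = c(3L, 3L))

# hypothetical helper: summarise the storage type and shape of a matrix
describe_matrix <- function(m) {
  list(
    type       = typeof(m),           # "double" for a real matrix, "complex" for a complex one
    is_complex = is.complex(m),
    dims       = dim(m),
    all_finite = all(is.finite(m))
  )
}

info <- describe_matrix(x)
str(info)

# documented entry point; symmetric = TRUE mirrors the call in the traceback
ev <- eigen(x, symmetric = TRUE, only.values = TRUE)$values
print(ev)
```

If the summary reports a "double" 3 x 3 matrix and the call still fails with the complex-matrix error, that would suggest the data itself is not the problem.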

february 21 2016 update: i was able to reproduce this error (without R dying) on a second windows script. here is the permanent link

# install.packages( c( 'fields' , 'maps' , 'ggplot2' , 'raster' , 'sqldf' , 'rgeos' , 'rgdal' , 'sp' , 'digest' , 'ff' , 'descr' , 'SAScii' , 'stringr' , 'R.utils' , 'R.oo' , 'RCurl' , 'MonetDBLite' , 'MonetDB.R' , 'survey' , 'downloader' ) , repos=c("http://dev.monetdb.org/Assets/R/", "http://cran.rstudio.com/"))
# setwd( "S:/temp/PNAD" )
# warning: some large downloads
downloader::source_url( "https://raw.githubusercontent.com/davidbrae/swmap/4501e2c8927faaffa02c92d3e40d16beb44bca92/how%20to%20map%20the%20pesquisa%20nacional%20por%20amostra%20de%20domicilios.R" , echo = TRUE , prompt = FALSE )

and here is what happens at the point of the error. again La_rs appears corrupted.

> for ( i in 1:4 ){
+ 
+       this.krig.fit <-
+               Krig(
+                       cbind( x$x , x$y ) ,
+                       x[ , paste0( 'occcat' , i ) ] ,
+                       weights = x[ , paste0( 'weigh .... [TRUNCATED] 
Error in eigen(tempM, symmetric = TRUE) : 'a' must be a complex matrix
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> traceback()
8: eigen(tempM, symmetric = TRUE)
7: Krig.engine.default(out, verbose = verbose)
6: Krig(cbind(x$x, x$y), x[, paste0("occcat", i)], weights = x[, 
       paste0("weight", i)]) at filee101515cee#676
5: eval(expr, envir, enclos)
4: eval(ei, envir)
3: withVisible(eval(ei, envir))
2: source(temp_file, ...)
1: downloader::source_url("https://raw.githubusercontent.com/davidbrae/swmap/master/how%20to%20map%20the%20pesquisa%20nacional%20por%20amostra%20de%20domicilios.R", 
       echo = T, prompt = F)
> sessionInfo()
R version 3.2.3 (2015-12-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows Server 2008 R2 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] tcltk     grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] fields_8.3-6      maps_3.1.0        spam_1.3-0        ggplot2_2.0.0     raster_2.5-2     
 [6] sqldf_0.4-10      RSQLite_1.0.0     gsubfn_0.6-6      proto_0.3-10      rgeos_0.3-17     
[11] rgdal_1.1-3       sp_1.2-2          digest_0.6.9      ff_2.2-13         bit_1.1-12       
[16] descr_1.1.2       SAScii_1.0        stringr_1.0.0     R.utils_2.2.0     R.oo_1.19.0      
[21] R.methodsS3_1.7.0 RCurl_1.95-4.6    bitops_1.0-6      MonetDBLite_0.2.0 MonetDB.R_1.0.1  
[26] DBI_0.3.1         survey_3.30-3     downloader_0.4   

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.3      plyr_1.8.3       tools_3.2.3      gtable_0.1.2     lattice_0.20-33 
 [6] magrittr_1.5     scales_0.3.0     codetools_0.2-14 xtable_1.8-0     colorspace_1.2-6
[11] stringi_1.0-1    munsell_0.4.2    chron_2.3-47    
> 

february 27th, 2016 edit: very similar bug, adding gc() in the middle of this script prevents the crash

# account creation page
# http://www.icpsr.umich.edu/rpxlogin?path=NACJD&request_uri=https%3a%2f%2fwww.icpsr.umich.edu%2ficpsrweb%2fNACJD%2f
your.username <- 'email@address.com'
your.password <- 'some_password'

setwd( "C:/My Directory/NCVS_BUG/" )
library(downloader)
source_url( "https://gist.githubusercontent.com/ajdamico/4cd5f76aebbdaae5bc88/raw/1ae140e84aa82f1c12af297badad6d8c2c50f5a1/ncvs_bughunt.R" , echo = TRUE , prompt = FALSE )
peterh
Anthony Damico
  • I get `Error in paste0("\"", path.to.7z, "\" x ", tf, " -aoa -o\"", tempdir(), : object 'path.to.7z' not found` running your code. –  Feb 17 '16 at 04:53
  • @Pascal ahh i'm sorry, see edit for mac/unix users – Anthony Damico Feb 17 '16 at 05:01
  • Finished. No error as far as I can see. R version 3.2.3 (2015-12-10) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 14.04.3 LTS –  Feb 17 '16 at 05:35
  • I must add that in the working directory, your script created a sub-directory called "MonetDB", as well as a picture ("2013 alaskan veteran service eras.png") and an image file ("acs2013_1yr.rda"). –  Feb 17 '16 at 08:08
  • @Pascal thanks for trying, i've edited the question to clarify this is only reproducible on windows. those files are correct – Anthony Damico Feb 17 '16 at 10:56
  • Have you tried running [RevolutionR open](https://mran.revolutionanalytics.com/open/)? The advantage of RevolutionR open is that it is linked against Intel MKL math library. Not sure whether `La_rs` relies on any library that resides in C but you might just give it a shot. – Stereo Feb 21 '16 at 18:57
  • @Stereo good suggestion- i have just confirmed that the crash-without-explanation occurs on revolutionr open for windows as well. thanks – Anthony Damico Feb 21 '16 at 21:09
  • I am still running your script but in the mean time I was looking at the [source code](https://github.com/wch/r-source/blob/b156e3a711967f58131e23c1b1dc1ea90e2f0c43/src/modules/lapack/Lapack.c) of `La_rs` and there seems to be various references to LAPACK. Since we have no idea whether Intel MKL has a different implementation of the routines called you might give other LAPACK / BLAS libraries a shot following these instructions [1](http://www.r-bloggers.com/an-openblas-based-rblas-for-windows-64/) [2](https://cran.r-project.org/doc/manuals/r-release/R-admin.html#Shared-BLAS). – Stereo Feb 22 '16 at 01:09
  • Just a couple of things that I think should be noted in the question - as currently constituted, reproduction of this problem requires downloading 500MB+ of data. Also the scripts overwrite the download file with each zip that's downloaded if you run the scripts "as is" like I just did... – J Richard Snape Feb 22 '16 at 16:03
  • @JRichardSnape fair point, i've edited the Q to warn users of this – Anthony Damico Feb 22 '16 at 16:13
  • Grepping the codebase - the error must be thrown from [within the La_zgecon function in lapack.c](https://github.com/wch/r-source/blob/b156e3a711967f58131e23c1b1dc1ea90e2f0c43/src/modules/lapack/Lapack.c#L443) This should only get called if you are dealing with `complex` data types. That makes me think that `eigen` is calling the complex version of `La_rs` i.e. `La_rs_complex` [here](https://github.com/wch/r-source/blob/b156e3a711967f58131e23c1b1dc1ea90e2f0c43/src/library/base/R/eigen.R#L52) This would imply your matrix is symmetric but has complex type. Does this ring a bell? – J Richard Snape Feb 22 '16 at 16:19
  • To check for this - you could run in debug mode and then check `complex.INDATA` where `INDATA` is the name of the matrix you're passing into the krigging function – J Richard Snape Feb 22 '16 at 16:25
  • @JRichardSnape i like where you're going with this.. the error has a lowercase `a` in it so i think the actual break occurs [within `La_solve_cmplx`](https://github.com/wch/r-source/blob/b156e3a711967f58131e23c1b1dc1ea90e2f0c43/src/modules/lapack/Lapack.c#L532)? none of this really rings a bell, because i haven't spent much time with any of the underlying c code here.. – Anthony Damico Feb 22 '16 at 17:11
  • Yeah - that makes more sense, TBH - I think my search found that occurence, but then I was lazy with the "find" on the github page (my mistake). I'll try to work out why the code is going down the `complex` execution path when all the numbers you're dealing with should be real. – J Richard Snape Feb 22 '16 at 17:21
  • @JRichardSnape thanks. very annoyingly, when i scatter the La_rs() that triggers the final crash throughout the script, it does not crash at all. see _february 22nd 2016_ edit for details.. – Anthony Damico Feb 22 '16 at 20:53

1 Answer

Wow, that error is hard to reproduce. The number of steps needed to repro is likely why you haven't had many answers.

I finally managed to get all the data downloaded and the packages installed, but I haven't yet got your code (which is fairly involved) to reach the point where you say it fails.

As per my comments, the error messages indicate that LAPACK is trying to execute the complex version of the functions you are using (error thrown by this line), but the input variable is not a matrix of the expected type, so the check fails and raises the error. This is most likely the low-level root cause.

The question you probably really want the answer to, though, is why does this happen?

I suspect this means that the input data for one of your steps is either empty or one-dimensional. I will continue trying to reproduce the problem to test this theory.
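
One way to test the empty-or-one-dimensional theory is to fail early with a readable message before eigen is reached. This is only a sketch under assumptions: checked_eigen is a hypothetical wrapper name, and it would have to be called in place of eigen at the suspect step, which the original scripts do not do.

```r
# hypothetical wrapper: validate the input, then defer to base::eigen
checked_eigen <- function(m, ...) {
  if (!is.matrix(m)) {
    stop("input is not a matrix: class ", paste(class(m), collapse = "/"))
  }
  if (length(m) == 0L) {
    stop("input matrix is empty")
  }
  if (nrow(m) != ncol(m)) {
    stop("input matrix is not square: ", nrow(m), " x ", ncol(m))
  }
  if (is.complex(m)) {
    warning("input matrix is complex")
  }
  eigen(m, symmetric = TRUE, ...)
}

# a well-formed input passes straight through
ok <- checked_eigen(diag(c(3, 2, 1)))
print(ok$values)

# an empty vector is rejected before any LAPACK code runs
bad <- tryCatch(checked_eigen(numeric(0)), error = function(e) conditionMessage(e))
print(bad)
```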

J Richard Snape
  • hi, thank you for keeping at it - know what's even weirder? calling a gc() in the middle of the script prevents the error. https://github.com/davidbrae/swmap/commit/820af7eb554e1bdfbeb439daf92d0daa77b4e2f2 it's a very fragile bug somewhere, about time i got some debug builds and learned gdb.. – Anthony Damico Feb 26 '16 at 16:50
  • i wonder if this is of any use.. https://stat.ethz.ch/R-manual/R-devel/library/base/html/gctorture.html ..since the `gc()` points to a memory allocation bug? – Anthony Damico Feb 26 '16 at 22:53
  • I'm thinking it's a memory bug where a really big matrix gets nastily converted under the cover. I'll keep chasing, because it's interesting. – J Richard Snape Feb 26 '16 at 23:38
  • thanks. if you uncover it, i'd love to know your sleuthing process – Anthony Damico Feb 26 '16 at 23:54
  • i've added another bug that appears to be the exact same issue: `gc()` corruption in behind-the-scenes C code, feb. 27 2016 edit – Anthony Damico Feb 27 '16 at 21:37
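
The gctorture idea from the comments can be sketched as follows. base::gctorture(TRUE) forces a garbage collection at nearly every allocation, which tends to surface memory-protection bugs in C code much sooner, at the cost of running very slowly. The wrapper name run_under_gctorture and the stand-in matrix are assumptions for illustration; in practice only the suspect section of the real script would be wrapped.

```r
# stand-in matrix, not the data produced by the scripts in the question
x <- diag(c(3, 2, 1))

# hypothetical helper: run one function call with gctorture switched on,
# restoring the previous setting afterwards even if the call fails
run_under_gctorture <- function(fun) {
  old <- gctorture(TRUE)     # returns the previous setting
  on.exit(gctorture(old))
  fun()
}

vals <- run_under_gctorture(function() {
  eigen(x, symmetric = TRUE, only.values = TRUE)$values
})
print(vals)
```

If the failure becomes reproducible close to a specific call under gctorture, that narrows down where the corruption is introduced; if an explicit gc() still hides it, the ordering of allocations is probably what matters.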