I'm trying to run topic models in R and find the best number of topics using the FindTopicsNumber function from the ldatuning package. When I run the following code on a MacBook Pro, the models are fitted, but as soon as it starts calculating the first metric I get a fatal error and the R session is terminated. The same code runs on a Windows machine without problems. Does anyone know why it might not run on a Mac?

sessionInfo() output:

R version 4.0.5 (2021-03-31)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.7

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.5.4       rstudioapi_0.13    xml2_1.3.2         magrittr_2.0.1     tidyselect_1.1.0  
 [6] munsell_0.5.0      colorspace_2.0-0   tm_0.7-8           R6_2.5.0           rlang_0.4.8       
[11] dplyr_1.0.2        tools_4.0.5        parallel_4.0.5     grid_4.0.5         gtable_0.3.0      
[16] modeltools_0.2-23  ellipsis_0.3.1     tibble_3.0.4       lifecycle_0.2.0    crayon_1.3.4      
[21] NLP_0.2-1          purrr_0.3.4        ggplot2_3.3.2      vctrs_0.3.5        glue_1.4.2        
[26] slam_0.1-47        compiler_4.0.5     pillar_1.4.7       topicmodels_0.2-12 generics_0.1.0    
[31] scales_1.1.1       stats4_4.0.5       pkgconfig_2.0.3    ldatuning_1.0.2

The Mac that produces the error:
MacBook Pro 13-inch late 2013
macOS Catalina 10.15.7
(tried with R 4.0.3 and 4.0.5)

The working Windows machine:
Dell XPS 15 9550
Windows 10.0.14393
(tried with R 4.0.4 and 4.0.5)

library(topicmodels)
data("AssociatedPress")

owl <- ldatuning::FindTopicsNumber(AssociatedPress, topics = c(1:10),
                            metrics = c("Griffiths2004", "CaoJuan2009",
                                        "Arun2010", "Deveaud2014"),
                            method = "Gibbs", control = list(seed = 1234),
                            mc.cores = parallel::detectCores() - 1,
                            verbose = TRUE)
  • Editing your question to include the output of `sessionInfo()` might help. Does leaving out the "Griffiths2004" metric produce the same error? – jared_mamrot May 11 '21 at 04:52
  • I added the `sessionInfo()` output to the question. It seems it's only the Griffiths2004 metric that produces the error. If I run only the other three metrics it works, and if I run only Griffiths2004 it throws the same fatal error. – berndthebread May 11 '21 at 05:35

1 Answer

Based on this GitHub issue, and the observation that only the Griffiths2004 metric causes the failure, the problem appears to be caused by the Rmpfr package. Reinstalling the package (i.e. install.packages("Rmpfr"); library(Rmpfr)) and/or building it from source may solve the issue. For detailed instructions on compiling R packages from source, see https://stackoverflow.com/a/65334247/12957340
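
One way to check whether Rmpfr itself is the culprit, independently of ldatuning, is to run a small high-precision calculation directly. This is only a sketch: it assumes, as suggested above, that the Griffiths2004 metric depends on Rmpfr arithmetic, and the input values are made up for illustration. If this snippet alone terminates the R session, the problem is in the Rmpfr installation rather than in ldatuning.

```r
# Made-up log-likelihood values, only for illustration.
x <- c(-1000.5, -1001.2, -999.8)
med <- median(x)

hm <- NA_real_  # stays NA if Rmpfr is not installed
if (requireNamespace("Rmpfr", quietly = TRUE)) {
  suppressPackageStartupMessages(library(Rmpfr))
  # High-precision harmonic-mean style calculation using mpfr numbers.
  hp <- exp(-mpfr(x, precBits = 2000L) + med)
  hm <- as.numeric(med - log(sum(hp) / length(x)))
} else {
  message("Rmpfr is not installed; skipping the check.")
}
print(hm)
```

If this prints a finite number, basic Rmpfr arithmetic works on the machine and the crash is more likely to come from how the package interacts with the rest of the session.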

Edit

I was able to install and run the command above using the following steps:

  1. Install the required dependencies using Homebrew:
brew install gsl
brew install gmp
brew install mpfr
  2. Edit the ~/.R/Makevars file to include these lines (after following the instructions in https://stackoverflow.com/a/65334247/12957340):
FLIBS=-L/usr/local/gfortran/lib/gcc/x86_64-apple-darwin19/10.2.0 -L/usr/local/gfortran/lib -lgfortran -lquadmath -lm
CXX1X=/usr/local/gfortran/bin/g++
CXX98=/usr/local/gfortran/bin/g++
CXX11=/usr/local/gfortran/bin/g++
CXX14=/usr/local/gfortran/bin/g++
CXX17=/usr/local/gfortran/bin/g++

CC=/usr/local/gfortran/bin/gcc
CXX=/usr/local/gfortran/bin/g++

PKG_LIBS=-L/usr/local/opt/gettext/lib
CFLAGS=-I/usr/local/opt/gsl/include -I/usr/local/opt/gmp/include -I/usr/local/opt/mpfr/include
LDFLAGS=-L/usr/local/opt/gsl/lib -L/usr/local/opt/gmp/lib -L/usr/local/opt/mpfr/lib -lgsl -lgslcblas
  3. Install the required packages:
install.packages("topicmodels", type = "source")
install.packages("ldatuning", type = "source")
install.packages("Rmpfr")
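
Source builds tend to fail at the configure step when the library headers cannot be found, so it may help to confirm that the Homebrew headers are where the Makevars lines expect them. The sketch below assumes the /usr/local/opt prefixes used in the Makevars lines above; on a machine where `brew --prefix` reports a different location, the paths would need adjusting.

```r
# Header locations assumed from the Makevars include paths above.
headers <- c(
  gsl  = "/usr/local/opt/gsl/include/gsl/gsl_version.h",
  gmp  = "/usr/local/opt/gmp/include/gmp.h",
  mpfr = "/usr/local/opt/mpfr/include/mpfr.h"
)

# TRUE means the header exists at the assumed path.
found <- file.exists(headers)
names(found) <- names(headers)
print(found)
```

Any FALSE entry points to a library that was not installed, or was installed under a different prefix than the one in the Makevars file.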

And finally, run the example:

library(topicmodels)
library(ldatuning)
data("AssociatedPress")

owl <- ldatuning::FindTopicsNumber(AssociatedPress, topics = c(1:10),
                                   metrics = c("Griffiths2004", "CaoJuan2009",
                                               "Arun2010", "Deveaud2014"),
                                   method = "Gibbs", control = list(seed = 1234),
                                   mc.cores = parallel::detectCores() - 1,
                                   verbose = TRUE)
#> warning: topics count can't to be less than 2, incorrect values was removed.
#> fit models... done.
#> calculate metrics:
#>  Griffiths2004... done.
#>  CaoJuan2009... done.
#>  Arun2010... done.
#>  Deveaud2014... done.
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ldatuning_1.0.2    topicmodels_0.2-12

loaded via a namespace (and not attached):
  [1] readxl_1.3.1         backports_1.2.1      Hmisc_4.5-0         
  [4] systemfonts_1.0.1    plyr_1.8.6           splines_4.0.3       
  [7] gmp_0.6-2            tfruns_1.5.0         usethis_2.0.1       
 [10] ggplot2_3.3.3        digest_0.6.27        htmltools_0.5.1.9000
 [13] matrixcalc_1.0-3     viridis_0.5.1        fansi_0.4.2         
 [16] magrittr_2.0.1       checkmate_2.0.0      memoise_2.0.0       
 [19] tm_0.7-8             cluster_2.1.1        openxlsx_4.2.3      
 [22] remotes_2.2.0        readr_1.4.0          extrafont_0.17      
 [25] vroom_1.4.0          extrafontdb_1.0      prettyunits_1.1.1   
 [28] jpeg_0.1-8.1         sem_3.1-11           colorspace_2.0-1    
 [31] haven_2.3.1          xfun_0.22            dplyr_1.0.5         
 [34] jsonlite_1.7.2       callr_3.6.0          crayon_1.4.1        
 [37] microbenchmark_1.4-7 lme4_1.1-26          zeallot_0.1.0       
 [40] survival_3.2-10      zoo_1.8-9            glue_1.4.2          
 [43] gtable_0.3.0         mi_1.0               car_3.0-10          
 [46] pkgbuild_1.2.0       Rttf2pt1_1.3.8       Rmpfr_0.8-4         
 [49] abind_1.4-5          scales_1.1.1         DBI_1.1.1           
 [52] rstatix_0.7.0        Rcpp_1.0.6           viridisLite_0.4.0   
 [55] htmlTable_2.1.0      tmvnsim_1.0-2        reticulate_1.18     
 [58] foreign_0.8-81       bit_4.0.4            Formula_1.2-4       
 [61] stats4_4.0.3         htmlwidgets_1.5.3    RColorBrewer_1.1-2  
 [64] lavaan_0.6-8         modeltools_0.2-23    ellipsis_0.3.2      
 [67] pkgconfig_2.0.3      rJava_0.9-13         farver_2.1.0        
 [70] nnet_7.3-15          utf8_1.2.1           janitor_2.1.0       
 [73] tidyselect_1.1.0     rlang_0.4.11         munsell_0.5.0       
 [76] cellranger_1.1.0     tools_4.0.3          cachem_1.0.5        
 [79] cli_2.5.0            generics_0.1.0       devtools_2.3.2      
 [82] broom_0.7.5          evaluate_0.14        stringr_1.4.0       
 [85] fastmap_1.1.0        arm_1.11-2           yaml_2.2.1          
 [88] processx_3.5.0       knitr_1.31           bit64_4.0.5         
 [91] fs_1.5.0             zip_2.1.1            purrr_0.3.4         
 [94] randomForest_4.6-14  nlme_3.1-152         whisker_0.4         
 [97] slam_0.1-48          xml2_1.3.2           compiler_4.0.3      
[100] rstudioapi_0.13      curl_4.3.1           png_0.1-7           
[103] testthat_3.0.2       ggsignif_0.6.1       gt_0.2.2            
[106] reprex_1.0.0         tibble_3.1.1         statmod_1.4.35      
[109] pbivnorm_0.6.0       stringi_1.5.3        ps_1.6.0            
[112] desc_1.3.0           gdtools_0.2.3        forcats_0.5.1       
[115] hrbrthemes_0.8.0     lattice_0.20-41      Matrix_1.3-2        
[118] tensorflow_2.4.0     keras_2.4.0          Amelia_1.7.6        
[121] nloptr_1.2.2.2       tabulizerjars_1.0.1  vctrs_0.3.8         
[124] pillar_1.6.1         lifecycle_1.0.0      BiocManager_1.30.12 
[127] data.table_1.14.0    cowplot_1.1.1        R6_2.5.0            
[130] latticeExtra_0.6-29  gridExtra_2.3        rio_0.5.26          
[133] sessioninfo_1.1.1    boot_1.3-27          MASS_7.3-53.1       
[136] assertthat_0.2.1     pkgload_1.2.0        rprojroot_2.0.2     
[139] withr_2.4.2          mnormt_2.0.2         parallel_4.0.3      
[142] hms_1.0.0            grid_4.0.3           rpart_4.1-15        
[145] labelled_2.8.0       tidyr_1.1.3          coda_0.19-4         
[148] minqa_1.2.4          rmarkdown_2.7        snakecase_0.11.0    
[151] carData_3.0-4        NLP_0.2-1            snowfall_1.84-6.1   
[154] lubridate_1.7.10     base64enc_0.1-3      bmem_1.8            
[157] tabulizer_0.2.2
jared_mamrot
  • I tried to install Rmpfr from source but always run into the following error: `configure: error: Header file mpfr.h not found; maybe use --with-mpfr-include=INCLUDE_PATH` `ERROR: configuration failed for package ‘Rmpfr’`. Neither the instructions you linked nor instructions on how to do it on macOS Catalina worked for me. – berndthebread May 18 '21 at 03:51
  • I can install and reinstall Rmpfr with install.packages, but that does not help; it's still failing. I can't install it from source with install.packages("Rmpfr", type = "source"). – berndthebread May 18 '21 at 04:08
  • I wasn't able to install Rmpfr from source either, but it didn't matter on my system with ldatuning and topicmodels installed from source. Try the steps in my edited answer and hopefully it will work. – jared_mamrot May 18 '21 at 05:24
  • Thanks for your suggestions! I tried them, but unfortunately it's still not working. I noticed that my `sessionInfo()` doesn't show `Rmpfr` in the "loaded via a namespace" part. Even when I load it with `library()` and it shows up as an attached package, I still get a fatal error at the Griffiths2004 metric. – berndthebread May 25 '21 at 02:05
  • Sorry Torven :( I'm not sure what else to try. If you raise an issue at https://github.com/nikita-moor/ldatuning/issues and link to this Stack Overflow post, they might be able to help you get to the bottom of the problem. – jared_mamrot May 25 '21 at 02:45