3

I am connecting to a SQL server database in R using the RODBC package and making multiple calls to the database, doing something with the result, writing the result to disk and removing all variables. I then call gc() and close the channel before starting the next iteration. This is roughly shown below.

for(i in 1:1000){
    ch <- odbcConnect("some dsn")

    result = sqlQuery(ch, 'SELECT * from table where some_condition = i')
    #some processing here...


    saveRDS(result, 'result.rdata')
    # this should remove all object in memory
    rm(result)
    gc()
    odbcClose(ch)
}

My problem is that as the loop progresses, the amount of memory being used by R gets larger and larger (as revelaed in the windows Resouce Monitor), despite the total size of the objects in the global environment not increasing. Eventually, I use up all of the memory available on my machine (16gb, Windows 7, 64 bit) and the process crashes.

  • Why is this happenning?
  • How can I force R to free up the memory without restarting R?

Things I have tried that did not work:

  • multiple calls to gc() as suggested here https://stackoverflow.com/a/1467334/366502.
  • changing the buffsize, rows_at_time and believeNRows arguments in sqlQuery()

    sessionInfo() R version 3.2.0 (2015-04-16) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1

    locale: [1] LC_COLLATE=English_United Kingdom.1252 LC_CTYPE=English_United Kingdom.1252 LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C LC_TIME=English_United Kingdom.1252

    attached base packages: [1] stats graphics grDevices utils datasets methods base

    other attached packages: [1] data.table_1.9.4 DBI_0.3.1 rsqlserver_1.0 rClr_0.7-4 text_0.0.0.9000 stringr_1.0.0 plyr_1.8.3 RODBC_1.3-11 rvest_0.2.0 tomr_0.0.0.9000

    loaded via a namespace (and not attached): [1] Rcpp_0.12.0 lattice_0.20-31 XML_3.98-1.2 slam_0.1-32 grid_3.2.0 chron_2.3-47 R6_2.1.1 magrittr_1.5 httr_1.0.0 stringi_0.5-5 reshape2_1.4.1 NLP_0.1-8 Matrix_1.2-1 tools_3.2.0
    [15] parallel_3.2.0 tm_0.6-2

Community
  • 1
  • 1
Tom Liptrot
  • 2,273
  • 21
  • 23
  • 1
    Might be worth trying [RSQLServer](https://cran.r-project.org/web/packages/RSQLServer/index.html), and if you don't see the same problem then you'd (a) have a solution and (b) know that the problem is probably in RODBC's C code somewhere. – joran Sep 09 '15 at 15:42
  • My eventual solution to this has been to use Rscript.exe and start a fresh instances of R for each iteration of the loop. – Tom Liptrot Sep 14 '15 at 11:30

0 Answers0