I am connecting to a SQL Server database in R using the RODBC package. In a loop I make repeated calls to the database, do something with each result, write it to disk, and remove all variables. I then call gc() and close the channel before starting the next iteration, roughly as shown below.
library(RODBC)

for (i in 1:1000) {
  ch <- odbcConnect("some dsn")
  # build the query string so the loop index is actually substituted in
  query <- paste0("SELECT * FROM table WHERE some_condition = ", i)
  result <- sqlQuery(ch, query)
  # some processing here...
  saveRDS(result, "result.rds")
  # this should remove all large objects from memory
  rm(result)
  gc()
  odbcClose(ch)
}
My problem is that as the loop progresses, the amount of memory used by R grows larger and larger (as revealed in the Windows Resource Monitor), even though the total size of the objects in the global environment does not increase. Eventually all of the memory available on my machine (16 GB, Windows 7, 64-bit) is used up and the process crashes.
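For what it's worth, this is roughly how I confirm from within R that the objects in the global environment are not growing (just summing object.size() over everything in the workspace; the exact check is not the point):

objs <- ls(envir = globalenv())
sizes <- sapply(objs, function(x) as.numeric(object.size(get(x, envir = globalenv()))))
cat("total object size (MB):", round(sum(sizes) / 1024^2, 1), "\n")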
- Why is this happening?
- How can I force R to free up the memory without restarting R?
Things I have tried that did not work:
- multiple calls to gc(), as suggested here: https://stackoverflow.com/a/1467334/366502
- changing the buffsize, rows_at_time and believeNRows arguments in sqlQuery() (roughly as in the sketch after this list)
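For completeness, this is roughly how I set those arguments (the specific values here are only examples, not values I think are "correct"):

result <- sqlQuery(ch, query,
                   buffsize = 1000,       # rows fetched per buffer by sqlGetResults
                   rows_at_time = 1,      # fetch one row at a time
                   believeNRows = FALSE)  # don't trust the driver's row count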
sessionInfo()

R version 3.2.0 (2015-04-16)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
 [1] data.table_1.9.4 DBI_0.3.1        rsqlserver_1.0   rClr_0.7-4       text_0.0.0.9000
 [6] stringr_1.0.0    plyr_1.8.3       RODBC_1.3-11     rvest_0.2.0      tomr_0.0.0.9000

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.0     lattice_0.20-31 XML_3.98-1.2    slam_0.1-32     grid_3.2.0
 [6] chron_2.3-47    R6_2.1.1        magrittr_1.5    httr_1.0.0      stringi_0.5-5
[11] reshape2_1.4.1  NLP_0.1-8       Matrix_1.2-1    tools_3.2.0     parallel_3.2.0
[16] tm_0.6-2