
Consider these two pieces of code. In the first one, things work normally, and the memory usage of R is stable:

for (i in 1:100) {
  x <- rnorm(1000000)    # generate a fresh vector of one million random draws
  write.table(x, file = "test", col.names = FALSE, append = TRUE)
}

Now consider this related code, where I scrape information from the World Bank about an economic indicator. Here, R's memory usage grows with each iteration of the loop:

library(RCurl)
library(XML)
for (i in 1:26) {
  # fetch one page of the indicator from the World Bank API
  x <- getURL(paste("http://api.worldbank.org/countries/all/indicators/AG.AGR.TRAC.NO?per_page=500&date=1960:2012&page=", as.character(i), sep = ""))
  x <- xmlToDataFrame(x)    # parse the XML response into a data frame
  write.table(x, file = "test", col.names = FALSE, append = TRUE)
}

What is the difference between these two snippets from the point of view of writing data, and how can I ensure that the second one releases memory properly?

qua
  • My R is version 2.15 and my XML is version 3.9-4.1, which seem to be the most recent updates. – qua Jun 21 '12 at 23:30
  • I've tried updating to XML version 3.93-0 by downloading the source code and building it with Rtools, but to no avail. Installing from the repository at omegahat.org doesn't work either. – qua Jun 22 '12 at 00:31
  • Your code works perfectly for me. Are you working on a 32-bit or 64-bit system? – Davy Kavanagh Jun 22 '12 at 09:20
  • I'm working on 64-bit. The linked Stack Overflow page above mentions an updated binary for XML at omegahat, but it isn't available there. – qua Jun 22 '12 at 12:53

1 Answer


OK, here is what I did to make it work: I downloaded the binary from http://www.omegahat.org/R/bin/windows/contrib/2.14/ and installed it with install.packages("XML", repos=NULL). It only worked in 32-bit R.
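If updating the XML binary isn't an option, a workaround worth trying is to parse the response into a document yourself, build the data frame from it, and then release the parsed document explicitly: the XML package stores parsed documents in C-level memory, which R's garbage collector may not reclaim on its own. Here is a minimal sketch of that approach (xmlParse, free, and gc are real functions from XML and base R; the loop bounds and URL are just the ones from the question):

library(RCurl)
library(XML)
for (i in 1:26) {
  raw <- getURL(paste("http://api.worldbank.org/countries/all/indicators/AG.AGR.TRAC.NO?per_page=500&date=1960:2012&page=", as.character(i), sep = ""))
  doc <- xmlParse(raw, asText = TRUE)   # parse into an internal (C-level) document
  x <- xmlToDataFrame(doc)              # build the data frame from the parsed document
  free(doc)                             # explicitly release the C-level document
  write.table(x, file = "test", col.names = FALSE, append = TRUE)
  rm(raw, doc, x)
  gc()                                  # prompt R to return the freed memory
}

Whether this fully plugs the leak may depend on your XML version; on the older builds, the binary update above was apparently still needed.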

qua