
I have a very large term-document matrix in R with dimensions 81094 x 14177. When I try to convert it to a regular matrix, I get this error:

Error: cannot allocate vector of size 8.6 Gb

The code that I have used is

new_matrix = as.matrix(old_matrix)

Is there a way to handle such situations in R where memory is insufficient?
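
For reference, the 8.6 Gb in the error message is consistent with a dense matrix of doubles at these dimensions. A quick check, assuming 8 bytes per cell and that R reports the size in units of 2^30 bytes (the variable names are illustrative only):

```r
# Dimensions taken from the question
n_rows <- 81094
n_cols <- 14177

# Assumption: every cell is stored as an 8-byte double
bytes_per_double <- 8

# Total bytes a fully dense matrix would need
dense_bytes <- n_rows * n_cols * bytes_per_double

# Convert to units of 2^30 bytes
dense_gib <- dense_bytes / 2^30
print(round(dense_gib, 1))
```

So the failure comes from materialising every zero, not from the data that is actually stored in the TDM.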

NinjaR
  • It would help if you could tell us the data type of the TDM you are trying to convert. Generally speaking, though, sparse matrices (`Matrix` package; note the capital letter) are the answer, since a TDM contains many zeroes. – mpjdem Jan 24 '17 at 11:01
  • The class is "TermDocumentMatrix" "simple_triplet_matrix" – NinjaR Jan 24 '17 at 11:18
  • When I call `object.size` on old_matrix, I get a size of 0.21 GB – NinjaR Jan 24 '17 at 13:28
  • The reason is that the TDM uses a sparse format (hence the `simple_triplet_matrix` class), which does not store the many, many zeroes in memory. A base `matrix`, by contrast, is dense and stores every zero. A `Matrix` object is sparse if you use the `sparse=TRUE` argument, so you should try that. – mpjdem Jan 24 '17 at 13:46
  • @mpjdem- thanks a ton. will surely try this – NinjaR Jan 25 '17 at 05:40
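
A minimal sketch of the sparse route suggested in the comments, for anyone landing here. It assumes the TDM exposes the usual `simple_triplet_matrix` fields (`i`, `j`, `v`, `nrow`, `ncol`, `dimnames`) and builds the sparse matrix directly from them with `Matrix::sparseMatrix`, so no dense intermediate is created. The `toy_tdm` list is a hand-made stand-in with those fields, not a real `tm` object, and `tdm_to_sparse` is a hypothetical helper name.

```r
library(Matrix)

# Stand-in for a TermDocumentMatrix: a plain list carrying the same
# triplet fields (row index, column index, value) plus the dimensions.
toy_tdm <- list(
  i = c(1L, 3L, 2L),   # term (row) indices of the non-zero cells
  j = c(1L, 1L, 2L),   # document (column) indices
  v = c(2, 1, 5),      # term counts
  nrow = 3L,
  ncol = 2L,
  dimnames = list(Terms = c("apple", "banana", "cherry"),
                  Docs = c("d1", "d2"))
)

# Hypothetical helper: build a sparse Matrix from the triplets,
# without ever calling as.matrix() on the TDM.
tdm_to_sparse <- function(tdm) {
  sparseMatrix(i = tdm$i, j = tdm$j, x = tdm$v,
               dims = c(tdm$nrow, tdm$ncol),
               dimnames = tdm$dimnames)
}

sparse_tdm <- tdm_to_sparse(toy_tdm)

# Common operations work on the sparse object directly
print(rowSums(sparse_tdm))  # total count per term
```

With a real TDM, the same helper would be called as `tdm_to_sparse(old_matrix)`. Whether later steps accept a sparse `Matrix` depends on the functions involved, so this is a starting point rather than a drop-in replacement for `as.matrix`.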
