Context
I'm currently working on a project involving OSM (OpenStreetMap) data. To manipulate geographic objects, I have to convert the data (an OSM XML file) into an R object. The osmar package can do this, but it fails while parsing the raw XML data.

The error

Error in paste(file, collapse = "\n") : result would exceed 2^31-1 bytes

The code

require(osmar)
osmar_obj <- get_osm("anything", source = osmsource_file("my filename"))

Inside the `get_osm` function, the code calls `ret <- xmlParse(raw)`, which triggers the error after a few seconds.

The question
How can I read a large XML file (10 GB here), given that I have 64 GB of memory?

Thanks a lot!

VeilleData
  • For those wondering, I checked the version of R I am running and it is a **64-bit** one. I also updated the XML package. – VeilleData Jul 22 '16 at 12:35
  • More details: in the xmlParse function, the error is raised at line 12: `file = paste(raw, collapse = "\n")`. That is pretty much what the error message said, though. – VeilleData Jul 22 '16 at 12:54
  • Please edit your post to include the full code block, not snippets of lines here and there. – Parfait Jul 22 '16 at 13:19

1 Answer

This is the solution I came up with, even though it is not 100% satisfactory.

  1. In your shell, transform the .osm file by removing every newline except the last one.
  2. Run the exact same code as before, skipping the `paste` call, which is no longer needed (since you just did the equivalent in the shell).
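
Step 1 could look something like the sketch below. This is not the exact command I ran, just a minimal illustration; the function name `flatten_osm` and the file names are hypothetical. It assumes no XML tag is split across lines (if one is, turning newlines into spaces with `tr '\n' ' '` is a safer variant).

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration): flatten_osm INPUT OUTPUT
# Writes a copy of INPUT to OUTPUT with every newline removed,
# then appends a single trailing newline.
flatten_osm() {
    # tr streams the file, so it does not need to fit in memory
    { tr -d '\n' < "$1"; printf '\n'; } > "$2"
}

# Example call (file names are placeholders):
# flatten_osm my_data.osm my_data_flat.osm
```

The output goes to a new file so the original data stays untouched.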

Profit :)

Obviously, I'm not very happy with it, because modifying the data file in the shell is more of a trick than an actual solution :(

VeilleData