
I am interested in learning how to work with "Road Network" files - for example, I would like to find the driving distance between two pairs of geographic coordinates (i.e. longitude and latitude).

I found this shapefile for the Canadian Road Network: https://www12.statcan.gc.ca/census-recensement/2011/geo/RNF-FRR/files-fichiers/lrnf000r22a_e.zip - now I am trying to import this file into R.

Below is the code I am using to first download the shapefile to a temporary folder and then read it:

library(sf)
library(rgdal)
# Set the URL for the shapefile
url <- "https://www12.statcan.gc.ca/census-recensement/2011/geo/RNF-FRR/files-fichiers/lrnf000r22a_e.zip"

# Create a temporary folder to download and extract the shapefile
temp_dir <- tempdir()
temp_file <- file.path(temp_dir, "lrnf000r22a_e.zip")

# Download the shapefile to the temporary folder
download.file(url, temp_file)

# Extract the shapefile from the downloaded zip file
unzip(temp_file, exdir = temp_dir)

# Read the shapefile using the rgdal package (loaded above)
shapefile <- readOGR(dsn = temp_dir, layer = "lrnf000r22a_e")
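
(As an aside: rgdal was on its way to retirement around this time; the sf equivalent of the read above would be something like the sketch below, though I would expect it to hit the same memory ceiling.)

# sf equivalent of the readOGR() call above
shapefile_sf <- st_read(dsn = temp_dir, layer = "lrnf000r22a_e")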

But when running the readOGR() line, I get the following error:

OGR data source with driver: ESRI Shapefile 
Source: "C:\Users\me\AppData\Local\Temp\RtmpwDKofs", layer: "lrnf000r22a_e"
with 2246324 features
It has 21 fields
Integer64 fields read as strings:  OBJECTID 
Error: memory exhausted (limit reached?)
In addition: Warning messages:

This seems to be a very large shapefile, and my computer does not have enough memory to work with it.

First, I tried to inspect the properties of the file (e.g. the number of columns):

> ogrInfo(dsn = temp_dir, layer = "lrnf000r22a_e")
Source: "C:\Users\me\AppData\Local\Temp\RtmpwXsVlD", layer: "lrnf000r22a_e"
Driver: ESRI Shapefile; number of rows: 2246324 
Feature type: wkbLineString with 2 dimensions
Extent: (3696309 665490.8) - (9015653 4438073)
CRS: +proj=lcc +lat_0=63.390675 +lon_0=-91.8666666666667 +lat_1=49 +lat_2=77 +x_0=6200000 +y_0=3000000 +datum=NAD83 +units=m +no_defs 
LDID: 87 
Number of fields: 21 
        name type length  typeName
1   OBJECTID   12     10 Integer64
2    NGD_UID    4      9    String
3       NAME    4     50    String
4       TYPE    4      6    String
5        DIR    4      2    String
6    AFL_VAL    4      9    String
7    ATL_VAL    4      9    String
8    AFR_VAL    4      9    String
9    ATR_VAL    4      9    String
10  CSDUID_L    4      7    String
11 CSDNAME_L    4    100    String
12 CSDTYPE_L    4      3    String
13  CSDUID_R    4      7    String
14 CSDNAME_R    4    100    String
15 CSDTYPE_R    4      3    String
16   PRUID_L    4      2    String
17  PRNAME_L    4    100    String
18   PRUID_R    4      2    String
19  PRNAME_R    4    100    String
20      RANK    4      4    String
21     CLASS    4      4    String
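
(For reference, sf can report similar layer metadata without loading any features; a minimal sketch using st_layers():)

# List layer names, geometry types, feature counts and CRS without reading features
st_layers(dsn = temp_dir)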

I then tried to see if it's possible to read the file in "chunks" (e.g. https://gis.stackexchange.com/questions/324374/read-n-number-of-rows-from-shapefile-using-geopandas) - for example, perhaps I could read the file in chunks of 1,000 rows until the whole file is imported:

test <- st_read(dsn = temp_dir, layer = "lrnf000r22a_e", n_max = 100, layer_options = c("GEOMETRY=AS_WKT", "FEATURE_TYPE=wkbLineString"))

I tried this but got the following error:

Reading layer `lrnf000r22a_e' from data source `C:\Users\me\AppData\Local\Temp\RtmpwXsVlD' using driver `ESRI Shapefile'
Error in st_sf(x, ..., agr = agr, sf_column_name = sf_column_name) : 
  no simple features geometry column present
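
As far as I can tell, st_read() does not document an n_max argument (and layer_options belongs to st_write()); the documented route for pulling a limited number of rows appears to be the query argument, since GDAL's OGR SQL dialect supports LIMIT and OFFSET on reasonably recent builds. A rough, untested sketch of chunked reading along those lines - noting that keeping every chunk still needs the same total RAM, so each chunk would have to be processed and then discarded:

chunk_size <- 1000
offset <- 0
repeat {
  q <- sprintf("SELECT * FROM lrnf000r22a_e LIMIT %d OFFSET %d",
               chunk_size, offset)
  chunk <- st_read(dsn = temp_dir, query = q, quiet = TRUE)
  if (nrow(chunk) == 0) break   # no rows left
  # ... process this chunk here (filter, summarise, write out, etc.) ...
  # then drop it before reading the next one, e.g. rm(chunk); gc()
  offset <- offset + chunk_size
}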

Has anyone ever encountered these kinds of problems before? Is there a way to fix this?

Thanks!

UPDATE:

# https://stackoverflow.com/questions/6457290/how-to-check-the-amount-of-ram
install.packages("memuse")
library(memuse)

memuse::Sys.meminfo()
Totalram:    7.648 GiB 
Freeram:   678.750 MiB 

# after re-starting my computer and opening R
memuse::Sys.meminfo()
Totalram:  7.648 GiB 
Freeram:   1.467 GiB 
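
Given how little free RAM I have, a workaround I am considering is reading only a subset of the file - either by attribute, via st_read()'s query argument, or spatially, via its wkt_filter argument. The sketch below is untested; the province code and coordinates are illustrative, and the wkt_filter geometry must be expressed in the layer's CRS (the proj string above appears to match EPSG:3347, NAD83 / Statistics Canada Lambert):

# Attribute filter: read only one province's roads
# ("35" should be Ontario's PRUID in StatCan's coding -- verify first)
ontario <- st_read(
  dsn = temp_dir,
  query = "SELECT * FROM lrnf000r22a_e WHERE PRUID_L = '35'"
)

# Spatial filter: keep only features intersecting a box around Toronto
# (corner coordinates are illustrative, given in lon/lat then projected)
bbox_lonlat <- st_sfc(st_polygon(list(rbind(
  c(-79.6, 43.5), c(-79.2, 43.5), c(-79.2, 43.9),
  c(-79.6, 43.9), c(-79.6, 43.5)
))), crs = 4326)
bbox_lcc <- st_transform(bbox_lonlat, 3347)
subset_roads <- st_read(
  dsn = temp_dir, layer = "lrnf000r22a_e",
  wkt_filter = st_as_text(bbox_lcc)
)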
  • The largest component in this shapefile is 1.1GB (about [half](https://desktop.arcgis.com/en/arcmap/latest/manage-data/shapefiles/geoprocessing-considerations-for-shapefile-output.htm) the limit of the shapefile spec). Most modern PCs can read in any shapefile because of the 2GB limit. This file uses 2GB of my RAM once read in, and `st_read()` max RAM usage was never higher than this (according to `bench::mark()`). How much free RAM do you have? If you have less than 2GB you will struggle to read in a shapefile of all Canada. You might want to increase the size of your swap file. – SamR Apr 02 '23 at 19:05
  • @ SamR: Thank you for your reply! I posted an update that showed how much RAM I have. – stats_noob Apr 02 '23 at 20:48
  • With 700MB of RAM you have little chance of reading a file that's 1.1GB on disk, in chunks or otherwise. If you can't use a machine with more RAM, try restarting your PC and closing any RAM hogs (e.g. web browsers, OneDrive and MS Teams on Windows) and/or [increasing your virtual memory](https://stackoverflow.com/a/1395256/12545041). – SamR Apr 02 '23 at 21:01
  • @ SamR: Thank you for your reply! I will restart my computer and try everything again. Is there some way to increase memory allocation in R, e.g. memory.limit(size=xyz)? – stats_noob Apr 02 '23 at 21:05
  • @ SamR: I also posted another update about my RAM after restarting my computer... – stats_noob Apr 02 '23 at 21:09
  • Well, if you want to increase the memory available to R beyond your physical RAM, there will be some operating-system-specific setting for your swap file, which is disk storage set aside to use as extra RAM. If you Google this for your OS you should find instructions. You may need to tell R to use it by setting the virtual memory explicitly, as in the link in my previous comment. I don't know if this will definitely work with these packages - I haven't tested it - but it's the only way I can think of to circumvent running out of RAM with a shapefile in R. – SamR Apr 03 '23 at 06:22

0 Answers