
I need to read a netCDF file into R that is stored on a remote filesystem. I have ssh access to the filesystem, but the files are too big to store on my local computer.

Following the advice in "Can R read from a file through an ssh connection?", I tried the following:

library(ncdf)
d = open.ncdf(pipe('ssh hostname "path/to/file/foo.nc"'))

However, I keep getting the error

bash: path/to/file/foo.nc: Permission denied

Any ideas on how to fix this?

Community

3 Answers

It is not possible to open the file directly from within R using ssh, but there are a few options available to you.

1. Mount the remote server as a local filesystem over ssh.

There are packages that let you mount remote machines as local filesystems over ssh; on Linux, for example, you might use sshfs, whereas on Windows you might use win-sshfs. Once you've mounted the remote filesystem, you can access the netCDF files from R just as you would any other file, although I'm not sure what the performance implications may be.
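As a rough sketch of the sshfs route on Linux: the helper names `mount_remote` and `unmount_remote` and the mount point `~/remote_nc` are made up for illustration, while `hostname` and `path/to/file` are the placeholders from the question. It assumes sshfs and FUSE are installed locally.

```shell
#!/bin/sh
# Placeholders from the question; replace with your own host and directory.
REMOTE="hostname:path/to/file"
# Hypothetical local mount point.
MOUNT_POINT="$HOME/remote_nc"

# Mount the remote directory read-only over ssh.
mount_remote() {
    mkdir -p "$MOUNT_POINT" &&
    sshfs -o ro "$REMOTE" "$MOUNT_POINT"
}

# Unmount when finished (fusermount is the Linux FUSE helper).
unmount_remote() {
    fusermount -u "$MOUNT_POINT"
}
```

After calling `mount_remote`, the file from the question should be visible locally as `~/remote_nc/foo.nc` and can be passed to R's netCDF functions as an ordinary file path. Data is still transferred over the network as it is read, so reading small subsets is likely to be much faster than reading whole variables.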

2. Break the larger files down into smaller files.

On the server, use the command-line ncdump utility to extract selected variables from the large files into smaller files that fit on your local file system.

$ ncdump -v [var1],[var2] big.nc > smaller.cdl

smaller.cdl will be a text file; you can generate a binary netCDF .nc file from it with ncgen:

$ ncgen -b -o smaller.nc smaller.cdl
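The two steps can also be driven from the local machine, so that only the reduced text dump crosses the network. This is a sketch under assumptions: the function name `fetch_subset` is hypothetical, `var1,var2` stand in for real variable names, `hostname` and the file path are the placeholders from the question, and ncdump must be installed on the server and ncgen locally.

```shell
#!/bin/sh
HOST="hostname"                    # placeholder from the question
REMOTE_FILE="path/to/file/foo.nc"  # placeholder from the question
VARS="var1,var2"                   # stand-ins for real variable names

# Run ncdump on the server, capture its CDL text output locally,
# then rebuild a binary netCDF file from that text with ncgen.
fetch_subset() {
    ssh "$HOST" "ncdump -v $VARS $REMOTE_FILE" > smaller.cdl &&
    ncgen -b -o smaller.nc smaller.cdl
}
```

Calling `fetch_subset` would leave `smaller.cdl` and `smaller.nc` in the current directory. Note that the CDL text form is usually considerably larger than the binary data it describes, so this only helps when the selected variables are small.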

3. Use an OPeNDAP service on the remote server.

Unless your remote server is already set up to provide an OPeNDAP service, this is probably overkill. If it is set up, you can combine R's OPeNDAP access with netCDF's OPeNDAP subset service to retrieve data subsets on the fly. You can also use ncdump on your local machine to request a subset of data from the server.
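For illustration, requesting data with a local ncdump might look roughly like this. The URL is entirely hypothetical, `var1` is a stand-in variable name, and the bracketed index range follows the DAP2 constraint-expression style; it assumes the local netCDF tools were built with DAP support.

```shell
#!/bin/sh
# Hypothetical OPeNDAP endpoint; replace with your server's actual URL.
DAP_URL="http://example.org/opendap/foo.nc"
# Constraint expression: only the first ten entries of var1 (stand-in name).
SUBSET_URL="${DAP_URL}?var1[0:9]"

# Print only the header (dimensions, variables, attributes).
show_header() {
    ncdump -h "$DAP_URL"
}

# Print the constrained subset; the URL is quoted so the shell
# does not interpret the ? and [ ] characters.
dump_subset() {
    ncdump "$SUBSET_URL"
}
```

Checking the header first with `show_header` is a cheap way to confirm the variable names and dimension sizes before requesting any data.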

Ward F.

I'd try to arrange a Samba or NFS share. After that, you can simply access the file like any other.

Paul Hiemstra

It is not possible to do this via ssh. The pipe command executes a shell command, so you are trying to execute path/to/file/foo.nc on the remote host, which fails because the file is not an executable. The examples you gave run a command and read its text output, which is then parsed by R. This is not the same.
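The difference can be reproduced locally without ssh. This small sketch uses a throwaway file in a temporary directory as a stand-in for foo.nc: treating the path as a command fails, while passing it to a command such as cat produces output that a pipe could read.

```shell
#!/bin/sh
# Create a plain data file with no execute permission (stand-in for foo.nc).
tmp=$(mktemp -d)
printf 'not a program\n' > "$tmp/foo.nc"

# Treating the path itself as a command fails with "Permission denied".
sh -c '"$1"' sh "$tmp/foo.nc" 2>/dev/null
status_exec=$?

# Running a command that reads the file succeeds and yields its contents.
out=$(sh -c 'cat "$1"' sh "$tmp/foo.nc")

echo "exit status when executing the file: $status_exec"
echo "output when reading it with cat: $out"

rm -rf "$tmp"
```

A binary netCDF file streamed this way would still not be usable by open.ncdf, which expects a file name rather than a stream of bytes, so fixing the command alone does not solve the original problem.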

The closest you could get is to use ncdump on the remote machine, which can be used to convert variables from the files into a text version, which you may be able to parse.

tiago