
I am trying to download multiple .nc files from an OPeNDAP server. When I download the files manually (without a script), they work as expected. To speed the process up, I wrote a script that batch-downloads the data. However, when I download the data using xarray, the files are 10x larger and seem to be corrupted.

My script looks like this:

import pandas as pd
import xarray as xr
import os
import numpy as np

dates = pd.date_range(start='2016-01-01',end='2016-01-05',freq='D')
my_url = "http://www.ifremer.fr/opendap/cerdap1/ghrsst/l4/saf/odyssea-nrt/data/"

print("  ")
print("Downloading data from OPeNDAP - sit back, relax, this will take a while...")
print("...")
print("...")

# Create a list of URLs, one per day (the server directory is <year>/<day-of-year>/)
subset = "?time[0:1:0],lat[0:1:1749],lon[0:1:2249],analysed_sst[0:1:0][0:1:1749][0:1:2249],analysis_error[0:1:0][0:1:1749][0:1:2249],mask[0:1:0][0:1:1749][0:1:2249],sea_ice_fraction[0:1:0][0:1:1749][0:1:2249]"
url_list = []
for date in dates[1:]:
    # e.g. .../data/2016/002/20160102-IFR-L4_GHRSST-...nc?time[0:1:0],...
    url_list.append(my_url + date.strftime('%Y/%j/%Y%m%d')
        + "-IFR-L4_GHRSST-SSTfnd-ODYSSEA-SAF_002-v2.0-fv1.0.nc" + subset)

# Download data from the URLs
count = 0
for data in url_list:
    print('Downloading file:', str(count))
    ds = xr.open_dataset(data,autoclose=True)
    fname = 'SAFodyssea_sst'+str(dates[count+1].year)+str('%02d'%dates[count+1].month)+str('%02d'%dates[count+1].day)+'.nc'
    ds.to_netcdf(fname)
    count = count +1
    del ds, fname

print('DONE !!!')

I have xarray version 0.10.8. I have tried running this with Python 2.7 and Python 3.5.6, on both Windows 10 and Ubuntu 16.04, and I get the same result.

Your help is much appreciated.

Jetman
  • There’s no particular reason why you should use opendap to access this data. If you can access a server with netcdf files, you can download them using Python, e.g., with the requests library. – shoyer Nov 15 '18 at 07:57
  • @shoyer I'm not sure I completely understand. I only have access to the data through opendap – Jetman Nov 15 '18 at 08:05
  • How do you "download the files manually"? – shoyer Nov 15 '18 at 21:25
  • @shoyer I copy and paste the URL into my web browser. If you have an alternative method to `xarray`, I'd be interested to see it. This is my first attempt at using a script to batch download data and I'm not sure how else to do this. – Jetman Nov 16 '18 at 04:22

1 Answer

Each of these files has an associated URL for the netCDF file, e.g., http://www.ifremer.fr/opendap/cerdap1/ghrsst/l4/saf/odyssea-nrt/data/2018/001/20180101-IFR-L4_GHRSST-SSTfnd-ODYSSEA-SAF_002-v2.0-fv1.0.nc

One simple way to solve this problem would be to use a library such as requests to download each file, e.g., as described here: How to download large file in python with requests.py?
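As a rough illustration of that approach, here is a minimal sketch that builds the direct file URLs and streams each one to disk with `requests`. The helper names (`file_url`, `download`, `download_days`), the chunk size, and the timeout are my own choices for illustration, and the sketch assumes the server lays files out as `<year>/<day-of-year>/<yyyymmdd>-...nc`, as in the example URL above. It is written for Python 3.

```python
import datetime as dt

import requests

BASE_URL = "http://www.ifremer.fr/opendap/cerdap1/ghrsst/l4/saf/odyssea-nrt/data/"
SUFFIX = "-IFR-L4_GHRSST-SSTfnd-ODYSSEA-SAF_002-v2.0-fv1.0.nc"


def file_url(day):
    """Build the direct netCDF URL for one date (assumed layout: year/day-of-year/file)."""
    return "{}{:%Y}/{:%j}/{:%Y%m%d}{}".format(BASE_URL, day, day, day, SUFFIX)


def download(url, path, chunk_size=1024 * 1024):
    """Stream one file to disk in chunks instead of holding it all in memory."""
    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()  # stop on 404/500 rather than saving an error page
        with open(path, "wb") as fh:
            for chunk in response.iter_content(chunk_size=chunk_size):
                fh.write(chunk)


def download_days(first_day, n_days):
    """Download n_days consecutive daily files, starting at first_day."""
    for offset in range(n_days):
        day = first_day + dt.timedelta(days=offset)
        download(file_url(day), "SAFodyssea_sst{:%Y%m%d}.nc".format(day))


# Example call (hypothetical date range, matching the question's script):
# download_days(dt.date(2016, 1, 2), 4)
```

Because this saves the bytes exactly as the server sends them, the files on disk should match what a browser download produces; nothing is decoded or re-encoded along the way.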

shoyer
  • Thanks very much. I ended up using `urllib` to download the data. It was a lot easier than I initially thought it would be. – Jetman Nov 17 '18 at 10:44
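
For anyone who wants the standard-library route mentioned in the last comment, a minimal sketch might look like the following. The exact code the asker used is not shown, so this is only one plausible version; `save_url` is a hypothetical helper name, and it uses the Python 3 module path (`urllib.request`), which differs on Python 2.7.

```python
import shutil
from urllib.request import urlopen


def save_url(url, path, timeout=60):
    """Copy the raw bytes behind `url` into a local file at `path`."""
    with urlopen(url, timeout=timeout) as response, open(path, "wb") as fh:
        # copyfileobj reads and writes in blocks, so large files are not
        # loaded into memory all at once
        shutil.copyfileobj(response, fh)


# Example call, using the file URL quoted in the answer:
# save_url(
#     "http://www.ifremer.fr/opendap/cerdap1/ghrsst/l4/saf/odyssea-nrt/data/"
#     "2018/001/20180101-IFR-L4_GHRSST-SSTfnd-ODYSSEA-SAF_002-v2.0-fv1.0.nc",
#     "SAFodyssea_sst20180101.nc",
# )
```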