1

I’m trying to subset 4 nc-files to get a shorter time period. Each file contains 94 years, and I’d like to subset 20 years in each file. I’m importing the 4 files with list.files (which is must I think since I will ultimately have even more files than 4, so I cannot import each file separately) and then converting them into one raster object. I think I should subset the shorter time period when the files are still contained as a list and before they are made into a raster object, since I’m unsure about what happens with the 4 nc-files are stacked into one raster object, i.e. how they are merged.

I can see the time dimension is stored, for each nc-file, as 94 years starting from 345 (=i.e. 2006 from the baseline 1661). But I cannot figure out how to subset the 4 files for a shorter period, e.g. 2006-2007, and then converting to a raster object.

#Open 4 files
leh_rcp6.0 = list.files("heatwave/rcp6.0",pattern='*.nc',full.names=TRUE)

[1] "heatwave/rcp6.0/lange2020_hwmid-humidex_gfdl-esm2m_ewembi_rcp60_nosoc_co2_leh_global_annual_2006_2099.nc4"  
[2] "heatwave/rcp6.0/lange2020_hwmid-humidex_hadgem2-es_ewembi_rcp60_nosoc_co2_leh_global_annual_2006_2099.nc4"  
[3] "heatwave/rcp6.0/lange2020_hwmid-humidex_ipsl-cm5a-lr_ewembi_rcp60_nosoc_co2_leh_global_annual_2006_2099.nc4"
[4] "heatwave/rcp6.0/lange2020_hwmid-humidex_miroc5_ewembi_rcp60_nosoc_co2_leh_global_annual_2006_2099.nc4" ``` 

# Combine files into Rasterstack object
leh_rcp6.0_stack <- stack(leh_rcp6.0)

Information for 1 nc-file to see how the dimensions are stored:

test <- nc_open("lange2020_hwmid-humidex_gfdl-esm2m_ewembi_rcp60_nosoc_co2_leh_global_annual_2006_2099.nc4") 

(NC_FORMAT_NETCDF4_CLASSIC):

     1 variables (excluding dimension variables):
        float leh[lon,lat,time]   (Chunking: [720,360,1])  (Compression: shuffle,level 5)
            _FillValue: 1.00000002004088e+20
            missing_value: 1.00000002004088e+20
            standard_name: land_area_fraction_exposed_to_heatwave
            long_name: Land area fraction exposed to heatwave
            units: 1

     3 dimensions:
        lon  Size:720 
            standard_name: longitude
            units: degrees_east
            axis: X
            long_name: Longitude
        lat  Size:360 
            standard_name: latitude
            units: degrees_north
            axis: Y
            long_name: Latitude
        time  Size:94   *** is unlimited *** 
            standard_name: time
            units: years since 1661-1-1 00:00:00
            calendar: 360_day
            axis: T
            long_name: Time```

time_test <- ncvar_get(test, "time")
head(time_test) # to have a look at the time dimension

[1] 345 346 347 348 349 350
MoonS
  • 117
  • 7

1 Answers1

0

Your data files are most likely formatted using the CF Metadata Conventions, which is commonly used for climate projection data. Your humidex data is obviously derived from such climate projections data, seeing that GFDL, HadGEM2, IPSL and MIROC are referenced in the file names. On the other hand, "years since 1661-1-1 00:00:00" is an uncommon unit string (CMIP5 and CMIP6 files would use "days since..."). That's too bad, otherwise you could have used the CFtime package available on CRAN, but that only supports the CF suggested units of days, hours, minutes and seconds. Have a look at the vignette in that package anyway, because it has a worked-out example of how to proceed.

The general approach, with some manual interventions, and using the current terra package, would be:

  • Fetch all the file names in a list
  • lapply() over the list to open the files and grab the data
  • 2006 and 2007 are the first two time slices in each file, so read out just those two layers
  • Create a raster object from those two layers

In R that becomes:

# Make a list of file names
files <- list.files(path = "heatwave/rcp6.0", pattern='*.nc', full.names=TRUE)

# Iterate over the list
all_data <- lapply(files, function(fn) {
  # Get the first two layers, representing 2006 and 2007
  nc <- nc_open(fn)
  d <- ncvar_get(nc, "leh", c(1, 1, 1), c(-1, -1, 2))
  nc_close(nc)

  # Rearrange the data, then create the SpatRaster
  r <- rast(aperm(d, c(2, 1, 3)), extent = c(-180, 180, -90, 90))
  names(r) <- 2006:2007
  r
})

all_data is a list with each element a SpatRaster which you can further process as required.

Patrick
  • 29,357
  • 6
  • 62
  • 90