18

I'm trying to read MODIS 17 data files into R, manipulate them (cropping etc.) and then save them as geoTIFF's. The data files come in .hdf format and there doesn't seem to be an easy way to read them into R.

Compared to other topics there isn't a lot of advice out there and most of it is several years old. Some of it also advises using additional programmes but I want to stick with just using R.

What package/s do people use for dealing with .hdf files in R?

James
  • 1,164
  • 2
  • 15
  • 36
  • Did you check [*this article*](http://www.r-bloggers.com/working-with-hdf-files-in-r-example-pathfinder-sst-data/) on R-bloggers? It provides a working example, whether the offered solution is *easy* it would be a matter of individual judgement. In terms of your second question, is the [`writeRater`](http://www.inside-r.org/packages/cran/raster/docs/writeRaster) method offer in the [**`raster`**](http://www.inside-r.org/packages/cran/raster) package of any use to you? – Konrad Apr 21 '16 at 14:18
  • @Konrad Yes I came across that article and spent a while trying to get it to work. Problem is I keep getting the following error when I try and use the h5ls function on my data: Error in H5Fopen(file, "H5F_ACC_RDONLY") : HDF5. File accessability. Unable to open file. I use the raster package a lot, its the reading in and manipulation of .hdf files I'm interested in people's opinions on as that's what's causing me problems. – James Apr 21 '16 at 14:25
  • Please consider making your problem reproducible on the lines of the suggestions expressed in [this discussion](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example?lq=1); it will be easier to help you. – Konrad Apr 21 '16 at 14:26
  • 2
    In case future readers encounter this problem, the 'MODISTools' package in R provides good functionality for getting MODIS data into R (but is not a general tool for dealing with HDF4) – Jacob Socolar Jun 06 '17 at 20:37

5 Answers5

27

Ok, so my MODIS hdf files were hdf4 rather than hdf5 format. It was surprisingly difficult to discover this, MODIS don't mention it on their website but there are a few hints in various blogs and stack exchange posts. In the end I had to download HDFView to find out for sure.

R doesn't do hdf4 files and pretty much all the packages (like rgdal) only support hdf5 files. There are a few posts about downloading drivers and compiling rgdal from source but it all seemed rather complicated and the posts were for MAC or Unix and I'm using Windows.

Basically gdal_translate from the gdalUtils package is the saving grace for anyone who wants to use hdf4 files in R. It converts hdf4 files into geoTIFFs without reading them into R. This means that you can't manipulate them at all e.g. by cropping them, so its worth getting the smallest tiles you can (for MODIS data through something like Reverb) to minimise computing time.

Here's and example of the code:

library(gdalUtils)

# Provides detailed data on hdf4 files but takes ages

gdalinfo("MOD17A3H.A2000001.h21v09.006.2015141183401.hdf")

# Tells me what subdatasets are within my hdf4 MODIS files and makes them into a list

sds <- get_subdatasets("MOD17A3H.A2000001.h21v09.006.2015141183401.hdf")
sds

[1] "HDF4_EOS:EOS_GRID:MOD17A3H.A2000001.h21v09.006.2015141183401.hdf:MOD_Grid_MOD17A3H:Npp_500m"   
[2] "HDF4_EOS:EOS_GRID:MOD17A3H.A2000001.h21v09.006.2015141183401.hdf:MOD_Grid_MOD17A3H:Npp_QC_500m"

# I'm only interested in the first subdataset and I can use gdal_translate to convert it to a .tif

gdal_translate(sds[1], dst_dataset = "NPP2000.tif")

# Load and plot the new .tif

rast <- raster("NPP2000.tif")
plot(rast)

# If you have lots of files then you can make a loop to do all this for you

files <- dir(pattern = ".hdf")
files

 [1] "MOD17A3H.A2000001.h21v09.006.2015141183401.hdf" "MOD17A3H.A2001001.h21v09.006.2015148124025.hdf"
 [3] "MOD17A3H.A2002001.h21v09.006.2015153182349.hdf" "MOD17A3H.A2003001.h21v09.006.2015166203852.hdf"
 [5] "MOD17A3H.A2004001.h21v09.006.2015099031743.hdf" "MOD17A3H.A2005001.h21v09.006.2015113012334.hdf"
 [7] "MOD17A3H.A2006001.h21v09.006.2015125163852.hdf" "MOD17A3H.A2007001.h21v09.006.2015169164508.hdf"
 [9] "MOD17A3H.A2008001.h21v09.006.2015186104744.hdf" "MOD17A3H.A2009001.h21v09.006.2015198113503.hdf"
[11] "MOD17A3H.A2010001.h21v09.006.2015216071137.hdf" "MOD17A3H.A2011001.h21v09.006.2015230092603.hdf"
[13] "MOD17A3H.A2012001.h21v09.006.2015254070417.hdf" "MOD17A3H.A2013001.h21v09.006.2015272075433.hdf"
[15] "MOD17A3H.A2014001.h21v09.006.2015295062210.hdf"

filename <- substr(files,11,14)
filename <- paste0("NPP", filename, ".tif")
filename

[1] "NPP2000.tif" "NPP2001.tif" "NPP2002.tif" "NPP2003.tif" "NPP2004.tif" "NPP2005.tif" "NPP2006.tif" "NPP2007.tif" "NPP2008.tif"
[10] "NPP2009.tif" "NPP2010.tif" "NPP2011.tif" "NPP2012.tif" "NPP2013.tif" "NPP2014.tif"

i <- 1

for (i in 1:15){
  sds <- get_subdatasets(files[i])
  gdal_translate(sds[1], dst_dataset = filename[i])
}

Now you can read your .tif files into R using, for example, raster from the raster package and work as normal. I've checked the resulting files against a few I converted manually using QGIS and they match so I'm confident the code is doing what I think it is. Thanks to Loïc Dutrieux and this for the help!

James
  • 1,164
  • 2
  • 15
  • 36
  • I have not gotten this to work for a large ~290 .hdf files. Has any one else run into troubles? – Evan Oct 04 '16 at 01:45
  • @James. I posted answer below as this conversion yields different results compared to ArcGIS. Any tips? – MIH Oct 27 '16 at 13:28
  • Hey thanks for this answer. I have tried using this code, which works well on 250m & 500m MODIS product, but on the 1km product returns 15 band TIFs rather than 36. Also, the get_subdatasets function doesn't appear to return all the 36 data bands in the file. Any thoughts? – Ollie Perkins Oct 16 '18 at 17:28
  • Linux `file` command would tell you the HDF version easily. – Rodrigo Dec 17 '22 at 17:41
11

These days you can use the terra package with HDF files

Either get sub-datasets

 library(terra)
 s <- sds("file.hdf")
 s

That can be extracted as SpatRasters like this

s[1]

Or create a SpatRaster of all subdatasets like this

r <- rast("file.hdf")
Robert Hijmans
  • 40,301
  • 4
  • 55
  • 63
2

This script has been very useful and I managed to convert a batch of 36 files using it. However, my problem is that the conversion does not seem correct. When I do it using ArcGIS 'Make NetCDF Raster Layer tool', I get different results + I am able to convert the numbers to C from Kelvin using simple formula: RasterValue * 0.02 - 273.15. With the results from R conversion I don't get the right results after conversion which leads me to believe ArcGIS conversion is good, and R conversion returns an error.

library(gdalUtils)
library(raster)

setwd("D:/Data/Climate/MODIS")

# Get a list of sds names
sds <- get_subdatasets('MOD11C3.A2009001.006.2016006051904.hdf')
# Isolate the name of the first sds
name <- sds[1]
filename <- 'Rasterinr.tif'

gdal_translate(sds[1], dst_dataset = filename)
# Load the Geotiff created into R
r <- raster(filename)

# Identify files to read:
rlist=list.files(getwd(), pattern="hdf$", full.names=FALSE)


# Substract last 5 digits from MODIS filename for use in a new .img filename
substrRight <- function(x, n){
        substr(x, nchar(x)-n+1, nchar(x))
}

filenames0 <- substrRight(rlist,9)
# Suffixes for MODIS files for identyfication:
filenamessuffix <- substr(filenames0,1,5)

listofnewnames <- c("2009.01.MODIS_","2009.02.MODIS_","2009.03.MODIS_","2009.04.MODIS_","2009.05.MODIS_",
                    "2009.06.MODIS_","2009.07.MODIS_","2009.08.MODIS_","2009.09.MODIS_","2009.10.MODIS_",
                    "2009.11.MODIS_","2009.12.MODIS_",
                    "2010.01.MODIS_","2010.02.MODIS_","2010.03.MODIS_","2010.04.MODIS_","2010.05.MODIS_",
                    "2010.06.MODIS_","2010.07.MODIS_","2010.08.MODIS_","2010.09.MODIS_","2010.10.MODIS_",
                    "2010.11.MODIS_","2010.12.MODIS_",
                    "2011.01.MODIS_","2011.02.MODIS_","2011.03.MODIS_","2011.04.MODIS_","2011.05.MODIS_",
                    "2011.06.MODIS_","2011.07.MODIS_","2011.08.MODIS_","2011.09.MODIS_","2011.10.MODIS_",
                    "2011.11.MODIS_","2011.12.MODIS_")

# Final new names for converted files:
newnames <- vector()
for (i in 1:length(listofnewnames)) {
        newnames[i] <- paste0(listofnewnames[i],filenamessuffix[i],".img")
}

# Loop converting files to raster from NetCDF
for (i in 1:length(rlist)) {
        sds <- get_subdatasets(rlist[i])
        gdal_translate(sds[1], dst_dataset = newnames[i])
}
MIH
  • 1,083
  • 3
  • 14
  • 26
  • Did you manage to resolve this? I'm having similar issues with MODIS when converting; seems to lose the scaling factor, but values are still off. – SpatialProgramming Feb 28 '20 at 12:09
2

The following worked for me. It's a short program and just takes in the input folder name. Make sure you know which sub data you want. I was interested in sub data 1.

library(raster)
library(gdalUtils)

inpath <- "E:/aster200102/ast_200102"

setwd(inpath)

filenames <- list.files(,pattern=".hdf$",full.names = FALSE)

for (filename in filenames)
{
     sds <- get_subdatasets(filename)
     gdal_translate(sds[1], dst_dataset=paste0(substr(filename, 1, nchar(filename)-4) ,".tif"))
}
Dekker1
  • 5,565
  • 25
  • 33
  • It worked for me. its a short program and just takes in the input folder name. make sure you know which sub data you want. I was interested in sub data 1. – Rajasweta Datta Oct 15 '17 at 23:23
2

Use the HEG toolkit provided by NASA to convert your hdf file to geotiff and then use any package ("raster" for example) to read the file. I do the same for both old and new hdf files.

Heres the link: https://newsroom.gsfc.nasa.gov/sdptoolkit/HEG/HEGHome.html

Take a look at the NASA products supported here: https://newsroom.gsfc.nasa.gov/sdptoolkit/HEG/HEGProductList.html

Hope this helps.