4

I am working on a classification problem with GRIB2 files. I have tried xarray, PyNIO, and pygrib to read the data but haven't been able to solve it yet. Can anybody explain the structure of GRIB2 files? How are they created? Is there a quick way to convert them to CSV?

psuresh
  • wgrib2 utility is the quick way. – Robert Davy Mar 09 '21 at 03:51
  • Is it a GRIB1 or a GRIB2 file? GRIB1 is the older format, so I'm assuming GRIB2. [wgrib2](https://www.cpc.ncep.noaa.gov/products/wesley/wgrib2/csv.html) seems to be able to export GRIB2 data as CSV files. If wgrib2 doesn't work for you, do you have an example of the file you are trying to convert to CSV? – Mark Loeffelbein Mar 09 '21 at 16:54

2 Answers

4
  1. I would recommend cfgrib. It is the state-of-the-art Python tool for GRIB, it is built on ecCodes, and it is fully integrated into xarray.

  2. There are two ways to open a GRIB file that you should know. The first one requires a filter argument if the file contains several variables:

import xarray

grib_data = xarray.open_dataset('/path/to/your/grib_file.grb', engine='cfgrib', backend_kwargs={'filter_by_keys':{'typeOfLevel': 'heightAboveGround','level': 2}})

To get all variables from the file you can use cfgrib's open_datasets:

import cfgrib
grib_data = cfgrib.open_datasets('/path/to/your/grib_file.grb')

grib_data will be a list of xarray.Dataset.

  3. If you are getting default parameter names like paramId0, you most likely need the right GRIB tables. These are provided by the weather service that published the GRIB file, and their location should be added to ECCODES_DEFINITION_PATH.
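
For the CSV part of the question, a dataset opened with xarray can be flattened through pandas. The following is a minimal sketch, assuming cfgrib, ecCodes and pandas are installed; the helper names open_grib and dataset_to_csv are made up for this example and are not part of xarray or cfgrib:

```python
def open_grib(path, filter_by_keys=None):
    """Open one GRIB file with the cfgrib engine (hypothetical helper)."""
    # Imported lazily so the CSV helper below can be used on any Dataset.
    import xarray as xr

    backend_kwargs = {"filter_by_keys": filter_by_keys} if filter_by_keys else {}
    return xr.open_dataset(path, engine="cfgrib", backend_kwargs=backend_kwargs)


def dataset_to_csv(dataset, csv_path):
    """Flatten an xarray.Dataset into one row per grid point and write it as CSV."""
    # to_dataframe() returns a pandas DataFrame indexed by the dataset dimensions;
    # reset_index() turns those dimensions into ordinary columns.
    frame = dataset.to_dataframe().reset_index()
    frame.to_csv(csv_path, index=False)
    return frame


# Usage (placeholder path):
# ds = open_grib("/path/to/your/grib_file.grb",
#                {"typeOfLevel": "heightAboveGround", "level": 2})
# dataset_to_csv(ds, "grib_file.csv")
```

Each CSV row then holds one grid point (typically its latitude and longitude) together with the variable values. For the list returned by cfgrib.open_datasets you would call dataset_to_csv once per dataset.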
dl.meteo
  • Hi, regarding the third point, I have this issue: some variables are paramId0 or unknown. How do I update the GRIB tables? What kind of files are these tables? – Jon Ander May 11 '21 at 12:56
  • You can find GRIB tables for the German Weather Service here: https://rcc.dwd.de/DE/leistungen/opendata/hilfe.html;jsessionid=862F7268C4FD22AB1F64DE65EF40BE58.live31092?nn=16102&lsbId=625220 You just have to add another path to ECCODES_DEFINITION_PATH, as you would do for PYTHONPATH. – dl.meteo May 12 '21 at 07:02
  • I'm working with the cfgrib library (via xarray) and I've realized that the ecCodes tables come from ECMWF and do not correspond to the NOAA GRIB2 variables. I see that there is another Python library, pywgrib2. I will try this other option and cross my fingers... – Jon Ander May 12 '21 at 09:42
  • You do not need to use another software. You are using the best we have in python so far. Just add the right grib tables from the origin of the data. – dl.meteo May 14 '21 at 10:27
  • Finally, I added the non-existing variables to some ecCodes files, and it now works with xarray and cfgrib. I did not find any downloadable tables from NOAA that are compatible with the ecCodes table system. – Jon Ander May 14 '21 at 11:16
  • Well, this is why the GRIB format does not really work as the WMO expected. It should have helped exchange weather forecasts globally using a simple standard, but from my point of view it has failed because of such problems. Every authority can write what they want into the files and define it in additional GRIB tables. If those are not available, it is difficult to obtain data from the file. This is why I prefer netCDF4. It has a well-structured header with all the information. No additional tables required. Best regards ;) – dl.meteo May 15 '21 at 09:30
3

Like GRIB1, GRIB2 is divided into messages, and each message into sections. A GRIB file can contain more than one message; the messages are simply concatenated one after another. If you have two GRIB2 files and want to merge them, a simple cat command is enough.
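
If you would rather merge the files from Python than from the shell, the same byte-for-byte concatenation can be sketched as follows. The function name concatenate_grib_files is made up for this example:

```python
import shutil


def concatenate_grib_files(input_paths, output_path):
    """Append the raw bytes of each input file to output_path, like `cat a b > out`."""
    with open(output_path, "wb") as out:
        for path in input_paths:
            with open(path, "rb") as src:
                # Messages are self-delimiting, so no separator is written between files.
                shutil.copyfileobj(src, out)
```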

GRIB2 is described, for example, here: https://www.yumpu.com/en/document/view/11723135/guide-to-wmo-table-driven-code-forms. Good online sources:

The online sources describe more in detail the parameters of the different sections.

While the basic concepts and the start of a message are similar to GRIB1, the later sections (1-7) are rather different in GRIB2. GRIB2 also allows some sections to be repeated within one message:

Section 0: Indicator Section
Section 1: Identification Section
Section 2: Local Use Section (optional)                                  |
Section 3: Grid Definition Section                       |               |
Section 4: Product Definition Section    |               |               |
Section 5: Data Representation Section   | (repeated)    | (repeated)    | (repeated)
Section 6: Bit-Map Section               |               |               |
Section 7: Data Section                  |               |               |
Section 8: End Section

Section 0 is always 16 bytes in GRIB2 (8 bytes in GRIB1) and contains the total length of the message (all sections), the GRIB edition (GRIB1/GRIB2) and, in the case of GRIB2, the discipline of the message.
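
As a rough illustration, the 16 bytes of a GRIB2 Section 0 can be unpacked with the standard library alone. This sketch assumes the usual edition 2 layout (the ASCII string GRIB, 2 reserved bytes, 1 byte discipline, 1 byte edition, 8 bytes total message length, all big-endian); the names Indicator and parse_indicator_section are made up for the example:

```python
import struct
from typing import NamedTuple


class Indicator(NamedTuple):
    discipline: int
    edition: int
    total_length: int


def parse_indicator_section(buf: bytes) -> Indicator:
    """Decode the 16-byte Section 0 found at the start of a GRIB2 message."""
    if len(buf) < 16:
        raise ValueError("need at least 16 bytes for a GRIB2 indicator section")
    # ">" = big-endian, no padding: 4s magic, H reserved, B discipline, B edition, Q length
    magic, _reserved, discipline, edition, total_length = struct.unpack(">4sHBBQ", buf[:16])
    if magic != b"GRIB":
        raise ValueError("not a GRIB message")
    if edition != 2:
        raise ValueError(f"expected edition 2, found {edition}")
    return Indicator(discipline, edition, total_length)
```

Because the total length is stored here, you can jump from one message to the next in a concatenated file without decoding anything else.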

In GRIB2, each section (except sections 0 and 8) starts with the length of the section (4 bytes) and the section number (1 byte). So when you read a binary GRIB2 file, you can fairly easily split a GRIB2 message into its sections.

Section 8: always contains 4 bytes, a string: 7777
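
Splitting a message into its sections can be sketched like this. It is only an illustration, assuming Section 0 is the fixed 16-byte header, Section 8 is the 4-byte string 7777, and every section in between starts with a 4-byte big-endian length followed by a 1-byte section number; iter_sections is a made-up name:

```python
def iter_sections(message: bytes):
    """Yield (section_number, section_bytes) for one complete GRIB2 message."""
    if message[:4] != b"GRIB":
        raise ValueError("not a GRIB message")
    total_length = int.from_bytes(message[8:16], "big")
    yield 0, message[:16]  # Section 0 has a fixed size and no length prefix
    pos = 16
    while pos + 4 <= total_length:
        # Checked first: the end marker has no length/number prefix either.
        if message[pos:pos + 4] == b"7777":
            yield 8, message[pos:pos + 4]
            return
        length = int.from_bytes(message[pos:pos + 4], "big")
        if length < 5 or pos + length > total_length:
            raise ValueError(f"implausible section length {length} at offset {pos}")
        number = message[pos + 4]
        yield number, message[pos:pos + length]
        pos += length
    raise ValueError("end section '7777' not found")
```

Repeated sections (for example several groups of Sections 4 to 7) simply show up as repeated section numbers in the output.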

Classification

Each GRIB file can contain multiple messages. A message consists of a grid (Section 3), a product definition (Section 4, one of: wind, temperature, relative humidity, ...) and data (Section 7). That means each file may contain different kinds of data (wind speed, air temperature, current direction, ...). If you want to classify those files, you should probably look into the Product Definition Section (Section 4).

Note that one message usually refers to one reference time (the time of measurement or creation of the dataset), defined in the Identification Section (Section 1), and one forecast time (the time for which the data, e.g. temperature, is valid), defined in the Product Definition Section (Section 4).
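
To make this concrete, here is a small sketch that pulls the reference time out of raw Section 1 bytes and the parameter identity out of raw Section 4 bytes. It assumes the standard octet positions (reference time in octets 13-19 of Section 1; parameter category and number in octets 10 and 11 of Section 4, which holds for the common templates such as 4.0). The function names are made up, and PARAMETER_NAMES lists only three sample entries from WMO Code Table 4.2:

```python
from datetime import datetime

# (discipline, category, number) -> name; a tiny illustrative subset only.
PARAMETER_NAMES = {
    (0, 0, 0): "Temperature",
    (0, 1, 1): "Relative humidity",
    (0, 2, 1): "Wind speed",
}


def reference_time(section1: bytes) -> datetime:
    """Read the reference time from the Identification Section (Section 1)."""
    if section1[4] != 1:
        raise ValueError("not a Section 1")
    year = int.from_bytes(section1[12:14], "big")  # octets 13-14
    month, day, hour, minute, second = section1[14:19]  # octets 15-19
    return datetime(year, month, day, hour, minute, second)


def parameter_key(discipline: int, section4: bytes) -> tuple:
    """Return (discipline, category, number) identifying the parameter."""
    if section4[4] != 4:
        raise ValueError("not a Section 4")
    category, number = section4[9], section4[10]  # octets 10 and 11
    return (discipline, category, number)
```

The discipline comes from Section 0, so the triple returned by parameter_key is what you would use as a class label or look up in the tables of the issuing centre.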

Conversion to CSV

You can read the Data Section (Section 7) with the help of the Data Representation Section (Section 5) to get the actual data. Using wgrib2 (https://www.cpc.ncep.noaa.gov/products/wesley/wgrib2/) you can easily dump the data into a file. Just note that to know what the data represent and which units they use, you also need to take the Product Definition Section (Section 4) into account.
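
If wgrib2 is installed, the dump can also be driven from Python. This is a minimal sketch that assumes the wgrib2 executable is on your PATH and uses its -csv option; the two function names are made up for the example:

```python
import shutil
import subprocess


def wgrib2_csv_command(grib_path, csv_path):
    """Build the argument list for `wgrib2 <grib_path> -csv <csv_path>`."""
    return ["wgrib2", grib_path, "-csv", csv_path]


def grib2_to_csv(grib_path, csv_path):
    """Run wgrib2 to dump every message of a GRIB2 file into one CSV file."""
    if shutil.which("wgrib2") is None:
        raise FileNotFoundError("wgrib2 executable not found on PATH")
    # check=True raises CalledProcessError if wgrib2 exits with an error.
    subprocess.run(wgrib2_csv_command(grib_path, csv_path), check=True)
```

The resulting file has one line per grid point and message, including the field name and level, so the Section 4 information is carried along in the text output.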

Note that the data in the Data Section (Section 7) are encoded. How they are encoded is described in the Data Representation Section (Section 5). In some cases there is also a non-empty Bit-Map Section (Section 6), which says at which positions there are valid values. An example may be cloud cover in percent. The Bit-Map Section would mark the grid points where there is any cloud cover (a true/false bit array). The Data Section then only contains the points where there is some cloud cover (the bit-map bit is true) and stores a value between 0 and 100%. This is one of the ways the size can be reduced.
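
The bit-map idea can be sketched in a few lines. The example assumes the values from Section 7 have already been decoded, that bitmap holds only the bit-map bytes themselves (without the Section 6 header), and that bits are read most significant bit first with 1 meaning "value present"; apply_bitmap is a made-up name:

```python
def apply_bitmap(bitmap: bytes, values, n_points: int, missing=None):
    """Spread the decoded values over the full grid, filling gaps with `missing`."""
    remaining = iter(values)
    grid = []
    for i in range(n_points):
        # Bit i of the bit-map, counted from the most significant bit of byte 0.
        present = (bitmap[i // 8] >> (7 - i % 8)) & 1
        grid.append(next(remaining) if present else missing)
    return grid


# Five grid points, three of them with data (bits 1 0 1 1 0):
print(apply_bitmap(bytes([0b10110000]), [10, 20, 30], 5))
# [10, None, 20, 30, None]
```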

Jan Kubovy