1

I want to write intermediate files for the Weather Research and Forecasting model with Python. These are binary files read by the model with Fortran. Files include some metadata and a 2d fields. Writing metadata works fine, e.g. by

# record1
g.write(struct.pack('>i', 4)) # record opening bytes
g.write(struct.pack('>i', d.ifv))
g.write(struct.pack('>i', 4)) # record closing bytes`

But when I want to write the field which is later read by some Fortran code like

real, dimension(:,:):: slab
...
read (1) slab

I struggle. I think I need to convert the numpy array to write to np.float32 type and that I have to write it in Fortran order. There is a page where some code to read intermediate files is presented and reading the data field is done there by

def _parse_record5(data, nx, ny):
  result = {}
  size = nx * ny
  fmt = ">{}f".format(size)
  parsed = struct.unpack(fmt, data)
  arr = np.array(parsed, dtype=np.float32)
  result["slab"] = arr.reshape((nx, ny), order="F")
  return result

but I am not able to figure out how to write the field properly. I tried to do it like this:

recl = d.nx*d.ny*4  # number of array elements + 4 bytes
g.write(struct.pack('>i', recl)) # record opening bytes
slab_temp = d.slab.astype(np.float32) # converting from 64 to 32 bytes
slab = slab_temp.tobytes('F')         # bytestream in Fortran order
g.write(struct.pack(str(recl)+'s', slab))  # writing bytestream to file
g.write(struct.pack('>i', recl)) # record closing bytes`

Unfortunately, it does not work properly and data read by the Fortran program are completely wrong. Nevertheless, the length of the bytestream that I have written seam to be ok, because metadata read from the next dataset written in the same way are fine.

What is the correst way to do this ? Your suggestions will be highly appreciated !

Klem
  • 49
  • 1
  • 8
  • 1
    The required data structure for an unformatted Fortran binary file is not standardized. Please tell us about the Fortran compiler used and approximate sizes of the data written. Also, it would be useful if you could give values you think should be read and what you are seeing read. – francescalus May 21 '19 at 15:01
  • 3
    Can you use a standard file format for your intermediate files instead? There may be a little bit of overhead, but since the format would be standardized, there should be good libraries for both Python and Fortran. Given the topic, netCDF4 (or its more generic base HDF5) could be good choices. – 9769953 May 21 '19 at 15:01
  • use stream rather than record based io – ptb May 21 '19 at 17:42
  • 1
    The suggestions to change the Fortran side are not that useful in my opinion. Even though I am a Fortran programmer myself, I certainly wouldn't want to deal with the I/O internals of such a huge code as WRF just for this reason. The upstream wouldn't change anyway, one would have to keep a diverging local branch and keep it updated. – Vladimir F Героям слава May 22 '19 at 06:43
  • @9769953: as Vladimir F commented, it is not a good idea to modifiy the WRF system to read another file type as intermediate file. Intermediate files work quite well and there is only the need to write themself when input data are not in grib format or someone wants to modify the input data. – Klem May 23 '19 at 08:01
  • @francescalus: I thought reading and writting binary files in Fortran is completely defined by endianesss, precision of the data / datatype and order of the array ('C'/'F'), but now I learned that there is a difference between form='binary' and form='unformatted' in the open statement. The model routine in my case uses form='unformatted' – Klem May 23 '19 at 08:07
  • `form='binary'` is not standard Fortran. You can read some details about the structure of records for `form='unformatted'` in questions [like this one](https://stackoverflow.com/q/8751185/3157076). Please do note that the examples there really don't cover everything, especially where four-byte lengths aren't sufficient to describe potential records. – francescalus May 23 '19 at 08:54
  • @francescalus: thanks for clarifying this, helps me a lot ! – Klem May 23 '19 at 10:02

1 Answers1

0

Finally, I was able to solve the problem. I missed to specify the endianess when writing the array to the file (WRF requires big endian but our linux computer runs with little endian). All other variables I already wrote with big endian, so they were read properly.

For completeness, I put the python code to read and write WRF intermediate files below:

# reads and writes dataset from / to a WRF intermediate file
import struct
import numpy as np

class dataset:
  ifv = 0.0
  hdate = ''
  xfcst = 0.0
  map_source = ''
  field = ''
  units_string = ''
  desc = ''
  xlvl = 0.0
  nx = 0
  ny = 0
  iproj = 0
  startloc = ''
  startlat = 0.0
  startlon = 0.0
  deltalat = 0.0
  deltalon = 0.0
  earth_radius = 0.0
  dx = 0.0
  dy = 0.0
  truelat1 = 0.0
  truelat2 = 0.0
  xlonc = 0.0
  nlats = 0.0
  is_wind_earth_rel = None
  slab = None

def print_dataset(res):
  print('ifv: ', res.ifv)
  print('hdate: ', res.hdate)
  print('xfcst: ', res.xfcst)
  print('map_source: ', res.map_source)
  print('field: ', res.field)
  print(' units_string: ', res.units_string)
  print(' desc: ', res.desc)
  print(' xlvl: ', res.xlvl)
  print(' nx: ', res.nx)
  print(' ny: ', res.ny)
  print(' iproj: ', res.iproj)
  print(' startloc: ', res.startloc)
  print(' startlat: ', res.startlat)
  print(' startlon: ', res.startlon)
  print(' deltalat: ', res.deltalat)
  print(' deltalon: ', res.deltalon)
  print(' earth_radius: ', res.earth_radius)
  print(' dx: ', res.dx)
  print(' dy: ', res.dy)
  print(' truelat1: ', res.truelat1)
  print(' truelat2: ', res.truelat2)
  print(' xlonc: ', res.xlonc)
  print(' nlats: ', res.nlats)
  print(' is_wind_earth_rel: ', res.is_wind_earth_rel)
  print(' slab: ', res.slab)

def read_dataset(f):
  d = dataset()
  recl=struct.unpack('>i',f.read(4))[0]
  d.ifv = struct.unpack('>i',f.read(recl))[0]
  recl_close=struct.unpack('>i',f.read(4))[0] # record closing bytes
  recl=struct.unpack('>i',f.read(4))[0]
  record2= struct.unpack(str(recl)+'s',f.read(recl))[0]
  d.hdate = record2[0:24].decode('UTF8')
  d.xfcst = struct.unpack('>f', record2[24:28])[0]
  d.map_source = record2[28:60].decode('UTF8')
  d.field = record2[60:69].decode('UTF8')
  d.units_string = record2[69:94].decode('UTF8')
  d.desc = record2[94:140].decode('UTF8')
  d.xlvl = struct.unpack('>f', record2[140:144])[0]
  d.nx = struct.unpack('>i', record2[144:148])[0]
  d.ny = struct.unpack('>i', record2[148:152])[0]
  d.iproj = struct.unpack('>i', record2[152:156])[0]
  recl_close=struct.unpack('>i',f.read(4))[0] # record closing bytes
  recl=struct.unpack('>i',f.read(4))[0]
  record3= struct.unpack(str(recl)+'s',f.read(recl))[0]
  d.startloc = record3[0:8].decode('UTF8')
  if(d.iproj == 0): # Cylindrical equidistant projection
    d.startlat = struct.unpack('>f', record3[8:12])[0]
    d.startlon = struct.unpack('>f', record3[12:16])[0]
    d.deltalat = struct.unpack('>f', record3[16:20])[0]
    d.deltalon = struct.unpack('>f', record3[20:24])[0]
    d.earth_radius = struct.unpack('>f', record3[24:28])[0]
  if(d.iproj == 1): # Mercator projection
    d.startlat = struct.unpack('>f', record3[8:12])[0]
    d.startlon = struct.unpack('>f', record3[12:16])[0]
    d.dx = struct.unpack('>f', record3[16:20])[0]
    d.dy = struct.unpack('>f', record3[20:24])[0]
    d.truelat1 = struct.unpack('>f', record3[24:28])[0]
    d.earth_radius = struct.unpack('>f', record3[28:32])[0]
  if(d.iproj == 3): # Lambert conformal projection
    d.startlat = struct.unpack('>f', record3[8:12])[0]
    d.startlon = struct.unpack('>f', record3[12:16])[0]
    d.dx = struct.unpack('>f', record3[16:20])[0]
    d.dy = struct.unpack('>f', record3[20:24])[0]
    d.xlonc = struct.unpack('>f', record3[24:28])[0]
    d.truelat1 = struct.unpack('>f', record3[28:32])[0]
    d.truelat2 = struct.unpack('>f', record3[32:36])[0]
    d.earth_radius = struct.unpack('>f', record3[36:40])[0]
  if(d.iproj == 4): # Gaussian projection
    d.startlat = struct.unpack('>f', record3[8:12])[0]
    d.startlon = struct.unpack('>f', record3[12:16])[0]
    d.nlats = struct.unpack('>f', record3[16:20])[0]
    d.deltalon = struct.unpack('>f', record3[20:24])[0]
    d.earth_radius = struct.unpack('>f', record3[24:28])[0]
  if(d.iproj == 5): # Polar-stereographic projection
    d.startlat = struct.unpack('>f', record3[8:12])[0]
    d.startlon = struct.unpack('>f', record3[12:16])[0]
    d.dx = struct.unpack('>f', record3[16:20])[0]
    d.dy = struct.unpack('>f', record3[20:24])[0]
    d.xlonc = struct.unpack('>f', record3[24:28])[0]
    d.truelat1 = struct.unpack('>f', record3[28:32])[0]
    d.earth_radius = struct.unpack('>f', record3[32:36])[0]
  recl_close=struct.unpack('>i',f.read(4))[0] # record closing bytes
  recl=struct.unpack('>i',f.read(4))[0]
  record4= struct.unpack(str(recl)+'s',f.read(recl))[0]
  d.is_wind_earth_rel = struct.unpack('>i', record4[0:4])[0]
  recl_close=struct.unpack('>l',f.read(4))[0] # record closing bytes
  # record 5
  recl=struct.unpack('>i',f.read(4))[0]
  record5 = struct.unpack(str(recl)+'s', f.read(recl))[0]
  slab = np.frombuffer(record5, dtype='>f')
  d.slab = slab.reshape(d.nx,d.ny)
  recl=struct.unpack('>i',f.read(4))[0]
  return d

def write_dataset(g, d):
  # record1
  g.write(struct.pack('>i', 4)) # record opening bytes
  g.write(struct.pack('>i', d.ifv))
  g.write(struct.pack('>i', 4)) # record closing bytes
  # record2
  g.write(struct.pack('>i', 156)) # record opening bytes
  g.write(struct.pack('24s', bytes(d.hdate, 'utf-8')))
  g.write(struct.pack('>f', d.xfcst))
  g.write(struct.pack('32s', bytes(d.map_source, 'utf-8')))
  g.write(struct.pack('9s', bytes(d.field, 'utf-8')))
  g.write(struct.pack('25s', bytes(d.units_string, 'utf-8')))
  g.write(struct.pack('46s', bytes(d.desc, 'utf-8')))
  g.write(struct.pack('>f', d.xlvl))
  g.write(struct.pack('>i', d.nx))
  g.write(struct.pack('>i', d.ny))
  g.write(struct.pack('>i', d.iproj))
  g.write(struct.pack('>i', 156)) # record closing bytes
  # record3
  if(d.iproj == 0): # Cylindrical equidistant projection
    g.write(struct.pack('>i', 28)) # record opening bytes
    g.write(struct.pack('8s', bytes(d.startloc, 'utf-8')))
    g.write(struct.pack('>f', d.startlat))
    g.write(struct.pack('>f', d.startlon))
    g.write(struct.pack('>f', d.deltalat))
    g.write(struct.pack('>f', d.deltalon))
    g.write(struct.pack('>f', d.earth_radius))
    g.write(struct.pack('>i', 28)) # record closing bytes
  if(d.iproj == 1): # Mercator projection
    g.write(struct.pack('>i', 32)) # record opening bytes
    g.write(struct.pack('8s', bytes(d.startloc, 'utf-8')))
    g.write(struct.pack('>f', d.startlat))
    g.write(struct.pack('>f', d.startlon))
    g.write(struct.pack('>f', d.dx))
    g.write(struct.pack('>f', d.dy))
    g.write(struct.pack('>f', d.truelat1))
    g.write(struct.pack('>f', d.earth_radius))
    g.write(struct.pack('>i', 32)) # record closing bytes
  if(d.iproj == 3): # Lambert conformal projection
    g.write(struct.pack('>i', 40)) # record opening bytes
    g.write(struct.pack('8s', bytes(d.startloc, 'utf-8')))
    g.write(struct.pack('>f', d.startlat))
    g.write(struct.pack('>f', d.startlon))
    g.write(struct.pack('>f', d.dx))
    g.write(struct.pack('>f', d.dy))
    g.write(struct.pack('>f', d.xlonc))
    g.write(struct.pack('>f', d.truelat1))
    g.write(struct.pack('>f', d.truelat2))
    g.write(struct.pack('>f', d.earth_radius))
    g.write(struct.pack('>i', 40)) # record closing bytes
  if(d.iproj == 4): # Gaussian projection
    g.write(struct.pack('>i', 28)) # record opening bytes
    g.write(struct.pack('8s', bytes(d.startloc, 'utf-8')))
    g.write(struct.pack('>f', d.startlat))
    g.write(struct.pack('>f', d.startlon))
    g.write(struct.pack('>f', d.nlats))
    g.write(struct.pack('>f', d.deltalon))
    g.write(struct.pack('>f', d.earth_radius))
    g.write(struct.pack('>i', 28)) # record closing bytes
  if(d.iproj == 5): # Polar-stereographic projection
    g.write(struct.pack('>i', 36)) # record opening bytes
    g.write(struct.pack('8s', bytes(d.startloc, 'utf-8')))
    g.write(struct.pack('>f', d.startlat))
    g.write(struct.pack('>f', d.startlon))
    g.write(struct.pack('>f', d.dx))
    g.write(struct.pack('>f', d.dy))
    g.write(struct.pack('>f', d.xlonc))
    g.write(struct.pack('>f', d.truelat1))
    g.write(struct.pack('>f', d.earth_radius))
    g.write(struct.pack('>i', 36)) # record closing bytes
  # record4
  g.write(struct.pack('>i', 4)) # record opening bytes
  g.write(struct.pack('>i', d.is_wind_earth_rel))
  g.write(struct.pack('>i', 4)) # record closing bytes
  # record 5
  recl = d.nx*d.ny*4
  g.write(struct.pack('>i', recl)) # record opening bytes
  slab_temp = d.slab.astype('>f')
  g.write(slab_temp.tobytes(order='F'))
  g.write(struct.pack('>i', recl)) # record closing bytes
Klem
  • 49
  • 1
  • 8