3

I have some single-precision little-endian unformatted data files written by Fortran77. I am reading these files using Python using the following commands:

import numpy as np
original_data = np.dtype('float32')
f = open(file_name,'rb')                                                                                                 
original_data = np.fromfile(f,dtype='float32',count=-1)                                                                            
f.close()

After some data manipulation in Python, I (am trying to) write them back in the original format using Python using the following commands:

out_file = open(output_file,"wb")                                                                                             
s = struct.pack('f'*len(manipulated_data), *manipulated_data)                                                                     
out_file.write(s)
out_file.close()

But it doesn't seem to be working. Any ideas what is the right way of writing the data using Python back in the original fortran unformatted format?

Details of the problem:

I am able to read the final file with manipulated data from Fortran. However, I want to visualize these data using a software (Paraview). For this I convert the unformatted data files in the *h5 format. I am able to convert both the original and manipulated data in h5 format using h5 utilities. But while Paraview is able to read the *h5 files created from original data, Paraview is not able to read the *h5 files created from the manipulated data. I am guessing something is being lost in translation.

This is how I am opening the file written by Python in Fortran (single precision data):

open (in_file_id,FILE=in_file,form='unformatted',access='direct',recl=4*n*n*n)

And this is I am writing the original unformatted data by Fortran:

open(out_file_id,FILE=out_file,form="unformatted")

Is this information sufficient?

askewchan
  • 45,161
  • 17
  • 118
  • 134
Pradeep Kumar Jha
  • 877
  • 4
  • 15
  • 30
  • Maybe this helps: http://www.scipy.org/Cookbook/FortranIO – Janne Karila Feb 20 '13 at 18:22
  • 2
    Why not save it to hdf5 or vtk directly from Fortran? – Vladimir F Героям слава Feb 20 '13 at 18:26
  • See also this SO question: [Fortran unformatted file format](http://stackoverflow.com/q/8751185/222914) – Janne Karila Feb 20 '13 at 18:26
  • first thing id do (besides not using unformatted fortran at all) is go through the whole process without any data maniputation. You should end up with the exact same file, or if not you can track down what the difference is. – agentp Feb 21 '13 at 19:47
  • on further look your input code is not properly reading unformatted fortran (You do literally mean a file opened with form=unformatted right?). The first field in an unformatted file should be int32. Your code gives garbage for the first array element, followed by the data. – agentp Feb 21 '13 at 20:08
  • thanks for the comment geroge. but I followed your suggestions and rewrote the file without any modifications using the same command and the final file is exactly the same as original. The diff command returns nothing. So I guess my mistake lies somewhere else. – Pradeep Kumar Jha Feb 22 '13 at 04:37
  • can you show the open/write statements from the fortran code? – agentp Feb 22 '13 at 14:28
  • `f = open(input_file,'rb') data = np.fromfile(f,dtype='float32',count=-1) f.close()` – Pradeep Kumar Jha Feb 23 '13 at 03:08
  • out_file = open(output_file,"wb") s = struct.pack('f'*len(data), *data) out_file.write(s) out_file.close() – Pradeep Kumar Jha Feb 23 '13 at 03:08
  • sorry I am not able to format my code inside the comments. – Pradeep Kumar Jha Feb 23 '13 at 03:09
  • thats python, i was asking about the Fortran code, which will be open(unit=number,...) .. write(number)variable BTW you should edit your question if you have stuff to add. – agentp Feb 23 '13 at 14:53
  • ...and in case you don't have access to the fortran the next question will be how do you know its unformatted? – agentp Feb 23 '13 at 15:00

2 Answers2

2

Have you tried using the .tofile method of the manipulated data array? It will write the array in C order but is capable of writing plain binary.

The documentation for .tofile also suggests this is the same as:

with open(outfile, 'wb') as fout:
    fout.write(manipulated_data.tostring())
ajdawson
  • 3,183
  • 27
  • 36
2

this is creating an unformatted sequential access file:

open(out_file_id,FILE=out_file,form="unformatted")

Assuming you are writing a single array real a(n,n,n) using simply write(out_file_id)a you should see a file size 4*n^3+8 bytes. The extra 8 bytes being a 4 byte integer (=4n^3) repeated at the start and end of the record.

the second form:

open (in_file_id,FILE=in_file,form='unformatted',access='direct',recl=4*n*n*n)

opens direct acess, which does not have those headers. For writing now you'd have write(unit,rec=1)a. If you read your sequential access file using direct acess it will read without error but you'll get that integer header read as a float (garbage) as the (1,1,1) array value, then everything else is shifted. You say you can read with fortran ,but are you looking to see that you are really reading what you expect?

The best fix to this is to fix your original fortran code to use unformatted,direct access for both reading and writing. This gives you an 'ordinary' raw binary file, no headers.

Alternately in your python you need to first read that 4 byte integer, then your data. On output you could put the integer headers back or not depending on what your paraview filter is expecting.

---------- here is python to read/modify/write an unformatted sequential fortran file containing a single record:

import struct
import numpy as np
f=open('infile','rb')
recl=struct.unpack('i',f.read(4))[0]
numval=recl/np.dtype('float32').itemsize
data=np.fromfile(f,dtype='float32',count=numval)
endrec=struct.unpack('i',f.read(4))[0]
if endrec is not recl: print "error unexpected end rec"
f.close()
f=open('outfile') 
f.write(struct.pack('i',recl))
for i in range(0,len(data)):data[i] = data[i]**2  #example data modification
data.tofile(f)
f.write(struct.pack('i',recl)

just loop for multiple records.. note that the data here is read as a vector and assumed to be all floats. Of course you need to know the actuall data type to make use if it.. Also be aware you may need to deal with byte order issues depending on platform.

agentp
  • 6,849
  • 2
  • 19
  • 37