
Good evening,

I have a simulation written in Fortran that produces large files of unformatted (direct-access) data. From some of these files I want to produce human-readable ASCII files.

For some reason this (in Python):

import struct

f = open(filename, 'rb')
for i in xrange(0, N):
    pos = i * 64                                # each record occupies 64 bytes
    f.seek(pos)
    name = struct.unpack('ffff', f.read(16))    # first four 4-byte floats of the record
    print name[0], name[1], name[2], name[3]

takes only ~4 seconds (piping the output into a file on the shell), while this (in Fortran)

 open (1, file=inputfile, access='direct', recl=64, action='read', status='old')
 open (2, file=outputfile, access='sequential', action='write', status='replace')
 do i = 1, N
     read(1, rec=i) a, b, c, d
     write(2,*) a, b, c, d
 enddo

takes ~20 seconds. What am I doing wrong? Is there a faster way of doing this in Fortran?

Best regards! rer

  • Try writing to standard output with the Fortran program and piping to the output file. – Tarik Aug 11 '13 at 20:40
  • thx - ok, I tried that, but it didn't change the time it needs – rer Aug 11 '13 at 21:43
  • I think the reason for the slow writing speed is the way Fortran handles its output units. [Here](https://www.ibm.com/developerworks/mydeveloperworks/blogs/b10932b4-0edd-4e61-89f2-6e478ccba9aa/entry/improving_i_o_performance_in_xl_fortran3?lang=en) is a description of what IBM's XLF does (tl;dr: prep, lock, write, cleanup, unlock). You could try to merge several read/write statements by unrolling the loop manually or simply storing larger chunks of data. In your case, this overhead seems to be the performance killer. – Stefan Aug 12 '13 at 13:59
  • Read into a large array and write the whole thing with a single statement. – agentp Aug 12 '13 at 14:05
  • I think @Stefan and george have a point. You could pre-allocate a large string, then print into it using a fixed-length format, and finally output the whole string in one shot; of course, you could do a chunk at a time (on the order of several thousand numbers) – see the sketch after these comments. – Tarik Aug 12 '13 at 16:02
  • There is also an option to do buffered IO operations. Maybe this would help, too. – Stefan Aug 16 '13 at 13:54
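
To make the chunking suggestion from the comments concrete, here is a minimal sketch (the file names, the chunk size and the total record count are assumptions, not taken from the question): each record is formatted into an array of fixed-length strings with internal writes, and the filled array is flushed to the output file with a single write statement per chunk.

program chunked_convert
   implicit none
   integer, parameter :: chunk = 4096            ! records per output statement (assumed)
   character(len=64)  :: lines(chunk)            ! one pre-formatted text line per record
   real               :: a, b, c, d
   integer            :: i, j, cnt, n

   n = 10000000                                  ! total record count (assumed)
   open(1, file='input.bin', access='direct', recl=64, action='read', status='old')
   open(2, file='output.txt', access='sequential', action='write', status='replace')

   do i = 1, n, chunk
      cnt = min(chunk, n - i + 1)
      do j = 1, cnt
         read(1, rec=i+j-1) a, b, c, d
         ! internal write: format into memory instead of touching the output unit per record
         write(lines(j), '(4(es14.6e2,1x))') a, b, c, d
      end do
      ! one write statement per chunk; format reversion puts each array element on its own line
      write(2, '(a)') lines(1:cnt)
   end do

   close(1)
   close(2)
end program chunked_convert

Writing the whole array with a single-item '(a)' format relies on format reversion, so each element still ends up on its own output line while the output unit is touched only once per chunk.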

2 Answers


DISCLAIMER: I don't know if this solves the problem, but I do know that I can get time differences of up to a factor of 20. I also only tested writing the data, not reading it.


I was investigating the interaction of Fortran with Python and as such wanted to know how Fortran's binary files are built. While doing this, I noticed that both ifort and gfortran have an option to switch buffered IO on or off.

ifort: You can specify the keyword BUFFERED=['YES'|'NO'] while opening a file.

gfortran: You can set the environment variable GFORTRAN_UNBUFFERED_ALL to y|Y|1 or n|N|0 for unbuffered and buffered IO, respectively.

Please note that gfortran buffers IO by default, while ifort does not.

My sample code at the bottom results in the following times:

        |buffered|unbuffered
--------+--------+----------
ifort   |   1.9s |  18.2s
gfortran|   2.4s |  37.5s

This sample code writes a direct-access binary file with 10 million records of 12 bytes each.

PROGRAM btest
IMPLICIT NONE

INTEGER :: i

! IFORT
OPEN(11,FILE="test_d.bin",ACCESS="DIRECT",FORM="UNFORMATTED",RECL=3, &
& STATUS="REPLACE",BUFFERED="NO") ! ifort defines RECL as words
! GFORTRAN
!OPEN(11,FILE="test_d.bin",ACCESS="DIRECT",FORM="UNFORMATTED",RECL=12, &
!& STATUS="REPLACE") ! gfortran defines RECL as bytes

DO i = 1, 10000000
    WRITE(11,REC=i) i,i*1._8
END DO

CLOSE(11)

END PROGRAM
Stefan

Try using stream IO; see http://www.star.le.ac.uk/~cgp/streamIO.html. That should allow random access without a fixed record size and will probably end up using the same underlying OS system calls, thereby hopefully giving the same performance.
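
For illustration, here is a minimal sketch of reading the same data through stream access (the file names, the record count, and the 64-byte record stride taken over from the question's Python code are assumptions):

program stream_read
   implicit none
   real    :: a, b, c, d
   integer :: i, n

   n = 10000000                                  ! total record count (assumed)
   ! stream access addresses the file by byte position instead of fixed-size records
   open(1, file='input.bin', access='stream', form='unformatted', action='read', status='old')
   open(2, file='output.txt', action='write', status='replace')

   do i = 1, n
      read(1, pos=(i-1)*64 + 1) a, b, c, d      ! POS is 1-based; 64-byte stride as in the question
      write(2, '(4(es14.6e2,1x))') a, b, c, d
   end do

   close(1)
   close(2)
end program stream_read

The output side is kept as in the question (one formatted write per record); it could be combined with the chunked formatting sketched in the comments above.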

Tarik
  • Thx! I tried `ACCESS="STREAM", FORM="FORMATTED"` and wrote with `write(2,"(4(ES14.6E2,X))") a,b,c,d`, but there was no improvement in performance. Even if I reduce the number of digits printed below the number of digits the Python output generates, it's much slower. I also verified that reading the data is very fast, so it's the printing that takes most of the time. – rer Aug 12 '13 at 07:16
  • I suspect the reading is what's slow, and this is where I would like you to try stream IO. To confirm my suspicion, comment out the write statement and just read without writing. – Tarik Aug 12 '13 at 07:21
  • That's exactly what I did to verify which process takes more time. It reads the file in about ~3 seconds. To make sure that my compiler doesn't skip the reading loop as the data read is not used, I printed the last entries to the command line. – rer Aug 12 '13 at 13:39