I am having some problems understanding the formatting of binary files that I am writing using Fortran. I use the following subroutine to write binary files to disk:

SUBROUTINE write_field(d,m,outfile)

    IMPLICIT NONE
    INTEGER, INTENT(IN) :: m              ! declared before d, whose bounds depend on it
    REAL, INTENT(IN) :: d(m,m,m)
    CHARACTER(len=256), INTENT(IN) :: outfile

    ! Unformatted stream access: no record markers should be written
    OPEN(7,file=outfile,form='unformatted',access='stream')
    WRITE(7) d
    CLOSE(7)

END SUBROUTINE write_field

My understanding of the access='stream' option was that it would suppress the standard header and footer that come with a Fortran binary (see Fortran unformatted file format).

If I write a file with m=512, I expect it to be 4 x 512^3 bytes = 536870912 bytes (512 MiB). However, the file is in fact 8 bytes longer than this, coming in at 536870920 bytes. My guess is that these extra bytes are the 4-byte header and footer, which I had wanted to suppress by using access='stream'.

The situation becomes more confusing if I write a file with m=1024. I expect it to be 4 x 1024^3 bytes = 4294967296 bytes (4 GiB). However, the file is in fact 24(!) bytes longer than this, coming in at 4294967320 bytes. I do not understand why there are 24 extra bytes here, which would seem to correspond to 6(!) headers or footers.
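One way to check these numbers without relying on ls is to ask the Fortran runtime for the file size directly. The following is only a sketch and not part of my original code: the program name check_size and the file name 'field.bin' are hypothetical, and it assumes a compiler that accepts a 64-bit integer for the SIZE= specifier of INQUIRE and provides the STORAGE_SIZE intrinsic.

```fortran
PROGRAM check_size

    USE, INTRINSIC :: iso_fortran_env, ONLY: int64
    IMPLICIT NONE
    INTEGER, PARAMETER :: m = 512                        ! grid size from the question
    CHARACTER(len=*), PARAMETER :: fname = 'field.bin'   ! hypothetical file name
    INTEGER(int64) :: expected, actual

    ! STORAGE_SIZE returns bits; a default REAL is assumed to occupy a whole number of bytes.
    ! 64-bit arithmetic is needed because 4 x 1024^3 overflows a default integer.
    expected = INT(STORAGE_SIZE(1.0)/8, int64) * INT(m, int64)**3

    ! SIZE= gives the size in file storage units (bytes on common compilers),
    ! or -1 if the size cannot be determined (for example, if the file does not exist)
    INQUIRE(file=fname, size=actual)

    PRINT *, 'expected bytes: ', expected
    PRINT *, 'actual bytes:   ', actual
    PRINT *, 'difference:     ', actual - expected

END PROGRAM check_size
```

Changing m to 1024 and pointing fname at the larger file gives the same comparison for the second case.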

My questions are:

(a) Is it possible to get Fortran to write a binary with no headers or footers?

(b) If the answer to (a) is 'no' then can I ensure that the larger binary has the same header and footer structure as the smaller binary?

(c) If the answers to (a) and (b) are both 'no' then how do I work out where these extra headers and footers are in the file?

I am using ifort (version 14.0.2) and I am writing the binary files on a small Linux cluster.

UPDATE: When I run the same code on OS X, compiled with gfortran 7.3.0, the binary files come out with the expected sizes: they are always 4 x m^3 bytes, even when m=1024. So this problem seems to be related to the older compiler.

UPDATE: In fact, the problem is only present when using ifort 14.0.2. I have updated the text to reflect this.

Mead
  • Does it work as expected for smaller values of m? – ptb May 04 '18 at 18:57
  • @ptb I think the file size for smaller values of `m` is always `4*m^3+8` bytes, and then it switches to `4*m^3+24` bytes somewhere between `m=512` and `m=1024`. – Mead May 04 '18 at 19:05
  • @VladimirF I don't know where the extra bytes are exactly. I have not tried to write a similar file from C. – Mead May 04 '18 at 19:07
  • Just out of curiosity, what happens if you add a `status='replace'` to the `open` command and remove the `form` – ptb May 04 '18 at 19:19
  • @VladimirF Maybe this is more relevant: https://stackoverflow.com/questions/15608421/inconsistent-record-marker-while-reading-fortran-unformatted-file I can't help but think that it has something to do with the ~2Gb limit for Fortran binary files. – Mead May 04 '18 at 19:20
  • first thing i would do is write a really small file and look at it with a hex editor.. – agentp May 04 '18 at 19:20
  • the large file issue is moot as with stream access there are no record markers. – agentp May 04 '18 at 19:21
  • I'm afraid I can't reproduce this. I get precisely the number of bytes expected for m=512 using gfortran 7.3.1. – ptb May 04 '18 at 19:31
  • @ptb I get the same with 7.3.0. See my update. I guess this problem is related to the older version of gfortran (and ifort) that I was using. – Mead May 04 '18 at 19:33
  • Some older versions of ifort had bugs in stream, unformatted writes of large data sizes. Try a newer version. – Steve Lionel May 06 '18 at 00:12
  • If this is a confirmed bug someone should post that as an answer – agentp May 07 '18 at 11:43
  • Have you looked at your output file? The additional bytes may just be the file size given on line 1. Just a suggestion. – Natsfan May 09 '18 at 04:24

1 Answer

This problem is solved by adding status='replace' to the Fortran open command. It has nothing to do with the compiler.

With access='stream' and without status='replace', an existing binary file is not replaced by the new one. Instead, it is overwritten in place, starting from the first byte (https://software.intel.com/en-us/forums/intel-fortran-compiler-for-linux-and-mac-os-x/topic/676047). The bytes of the old file are replaced up to the size of the new data, while any bytes beyond that point, and therefore the file size, are left unchanged. This is a problem if the new data are smaller than the old file. The problem is difficult to diagnose because the time-stamp on the file is updated, so the file looks new when queried using ls -l.

A minimal working example that recreates this problem is as follows:

PROGRAM write_binary_test_minimal

    IMPLICIT NONE
    REAL :: a

    a=1.

    OPEN(7,file='test',form='unformatted')
    WRITE(7) a
    CLOSE(7)

    OPEN(7,file='test',form='unformatted',access='stream')
    WRITE(7) a
    CLOSE(7)

END PROGRAM write_binary_test_minimal

The first write generates a file 'test' of size 8 + 4 = 12 bytes, where the 8 is the standard Fortran-binary header and footer and the 4 is the size in bytes of a. In the second write, even though access='stream' has been set, only the first 4 bytes of the previously generated 'test' are overwritten, leaving the file at 12 bytes! The solution is to change the second open statement to

OPEN(7,file='test',form='unformatted',access='stream',status='replace')

with an explicit status='replace' to ensure the old file is replaced.
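To confirm the fix, the file size can be queried after each write. The version below is a sketch along the lines of the minimal example above, with the program name write_binary_test_fixed chosen for illustration; it assumes the compiler accepts a 64-bit integer for the SIZE= specifier of INQUIRE and reports sizes in bytes.

```fortran
PROGRAM write_binary_test_fixed

    USE, INTRINSIC :: iso_fortran_env, ONLY: int64
    IMPLICIT NONE
    REAL :: a
    INTEGER(int64) :: nbytes

    a=1.

    ! First write: sequential unformatted, so record markers surround the data
    OPEN(7,file='test',form='unformatted',status='replace')
    WRITE(7) a
    CLOSE(7)
    INQUIRE(file='test',size=nbytes)
    PRINT *, 'size after sequential write: ', nbytes

    ! Second write: stream access, with status='replace' so the old file is discarded first
    OPEN(7,file='test',form='unformatted',access='stream',status='replace')
    WRITE(7) a
    CLOSE(7)
    INQUIRE(file='test',size=nbytes)
    PRINT *, 'size after stream write:     ', nbytes

END PROGRAM write_binary_test_fixed
```

If the explanation above is right, the second size should be just the size of a, with no leftover bytes from the first file. Removing status='replace' from the second open should bring back the larger size.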

Mead
  • If I run the code as I've written it above, compiled with `gfortran 8.1.0` I generate a file of size 12 bytes that is not replaced by the second write statement. I think this is not just an old version of `ifort` thing. – Mead May 29 '18 at 12:55
  • Sorry, yes, I did not realize you are effectively opening the file for reading and writing and that for a sequential file it does not happen, but in a stream file you can read and write at any position with `pos=` so you cannot have the rest deleted automatically. – Vladimir F Героям слава May 29 '18 at 14:20
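The point in the last comment, that a stream file can be written at any position and so its tail cannot be discarded automatically, can be illustrated with a short sketch. The program name stream_pos_demo and the file name 'pos_demo.bin' are hypothetical, and the example assumes that a default REAL occupies a whole number of bytes and that file storage units are bytes.

```fortran
PROGRAM stream_pos_demo

    IMPLICIT NONE
    REAL :: v(3), w(3)
    INTEGER :: nbytes

    v = [1., 2., 3.]
    nbytes = STORAGE_SIZE(v)/8     ! bytes per element (STORAGE_SIZE returns bits)

    ! Create a fresh stream file holding three reals
    OPEN(7,file='pos_demo.bin',form='unformatted',access='stream',status='replace')
    WRITE(7) v
    CLOSE(7)

    ! Reopen the existing file without replacing it and overwrite only the second element.
    ! POS= is 1-based, so the second element starts at byte 1+nbytes.
    OPEN(7,file='pos_demo.bin',form='unformatted',access='stream',status='old',action='readwrite')
    WRITE(7,pos=1+nbytes) 20.
    READ(7,pos=1) w                ! read all three values back from the start
    CLOSE(7)

    PRINT *, w

END PROGRAM stream_pos_demo
```

Because the first and third values are meant to survive the second open, the runtime has to leave the untouched bytes in place, which is the same behaviour that left the old record markers in the file in the question.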