
I wrote a very simple list of 5 numbers to a file with the following code, compiled with gfortran on a 64-bit Linux machine:

integer, parameter :: binary64  = selected_real_kind(15, 307)
real(kind=binary64) :: zero, one, pi
zero = 0.0_binary64
one  = 1.0_binary64
pi   = 3.141592653589793238_binary64
open(unit=10, file="test", action="write", status="new", form="unformatted")
write(unit=10) zero
write(unit=10) one
write(unit=10) pi
write(unit=10) zero
write(unit=10) one
close(unit=10)

Then I can read the file in hexadecimal with:

xxd short.test.file_binary64

This is the output:

0000000: 0800 0000 0000 0000 0000 0000 0800 0000  ................
0000010: 0800 0000 0000 0000 0000 f03f 0800 0000  ...........?....
0000020: 0800 0000 182d 4454 fb21 0940 0800 0000  .....-DT.!.@....
0000030: 0800 0000 0000 0000 0000 0000 0800 0000  ................
0000040: 0800 0000 0000 0000 0000 f03f 0800 0000  ...........?....

So each row starts with a number: 0, 10, 20, 30, 40. I am not sure which parts of the rest correspond to 0, 1, pi, 0, 1, what the 0800 0000 means, or what the right-hand column is (a failed representation in ASCII?).

And if I do a binary read:

xxd -b short.test.file_binary64

The output is even more cryptic:

0000000: 00001000 00000000 00000000 00000000 00000000 00000000  ......
0000006: 00000000 00000000 00000000 00000000 00000000 00000000  ......
000000c: 00001000 00000000 00000000 00000000 00001000 00000000  ......
0000012: 00000000 00000000 00000000 00000000 00000000 00000000  ......
0000018: 00000000 00000000 11110000 00111111 00001000 00000000  ...?..
000001e: 00000000 00000000 00001000 00000000 00000000 00000000  ......
0000024: 00011000 00101101 01000100 01010100 11111011 00100001  .-DT.!
000002a: 00001001 01000000 00001000 00000000 00000000 00000000  .@....
0000030: 00001000 00000000 00000000 00000000 00000000 00000000  ......
0000036: 00000000 00000000 00000000 00000000 00000000 00000000  ......
000003c: 00001000 00000000 00000000 00000000 00001000 00000000  ......
0000042: 00000000 00000000 00000000 00000000 00000000 00000000  ......
0000048: 00000000 00000000 11110000 00111111 00001000 00000000  ...?..
000004e: 00000000 00000000  

I need to read this data in Python. Hence my three questions, and I believe the first one is very simple:

  1. Is this something straightforward, where there is just one piece of information I need to learn after a reasonable amount of research?

  2. Or is this a rather complicated issue, because binary files are CPU and compiler dependent and so on, so that I might spend a week searching the internet and still not be able to sort it out?

  3. How can I read the data in Python?

Mephisto
  • There are lots of related questions which cover various details. Two examples: [1](https://stackoverflow.com/q/37063864); [2](https://stackoverflow.com/q/8131204). Looking instead at simply Fortran aspects will say much, but the same issues will appear with C, C++, etc. – francescalus Apr 26 '17 at 17:05

1 Answer


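Each record in the file is stored as a 32-bit integer length (the payload size in bytes, here 8), then the 64-bit double value, then the same 32-bit integer length again, all in little-endian byte order. That is where the repeated 0800 0000 in the dump comes from. The right-hand column of the xxd output is just the same bytes shown as ASCII, with a dot for every non-printable byte.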

You can decode a record with struct, e.g. the third one:

import struct
data = b'\x08\x00\x00\x00\x18-DT\xfb!\t@\x08\x00\x00\x00'
l1, value, l2 = struct.unpack("<idi", data)
# (8, 3.141592653589793, 8)
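
To connect this to the dump in the question, the hex columns printed by xxd can be decoded the same way. This is only a minimal sketch, assuming little-endian byte order and 4-byte record markers, which is what the dump suggests for this gfortran build; `rows` and `decode_row` are names made up for illustration.

```python
import struct

# Hex columns copied from the first three rows of the xxd output in the question.
rows = [
    "0800 0000 0000 0000 0000 0000 0800 0000",
    "0800 0000 0000 0000 0000 f03f 0800 0000",
    "0800 0000 182d 4454 fb21 0940 0800 0000",
]

def decode_row(hex_row):
    """Split one 16-byte row into (leading marker, value, trailing marker)."""
    raw = bytes.fromhex(hex_row.replace(" ", ""))
    return struct.unpack("<idi", raw)

for row in rows:
    print(decode_row(row))
```

Each row comes out as the marker 8, the stored value (0.0, 1.0 and pi), and the marker 8 again.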

So you need to read your file in blocks of 16 bytes:

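import struct

# Each record is 16 bytes: 4-byte length, 8-byte double, 4-byte length.
with open("short.test.file_binary64", "rb") as binary:
    while True:
        data = binary.read(16)
        if not data:
            break  # end of file
        l1, value, l2 = struct.unpack("<idi", data)
        print(value)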
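
The fixed 16-byte block only works because every record here holds a single 8-byte value. A slightly more general sketch reads the leading marker first and uses it to decide how many payload bytes follow. It assumes 4-byte little-endian record markers and records that contain only 64-bit reals; `read_records` is a hypothetical helper, not part of any library.

```python
import io
import struct

def read_records(stream):
    """Yield each record of a sequential unformatted file as a tuple of doubles.

    Assumption: 4-byte little-endian record markers and float64 payloads.
    """
    while True:
        head = stream.read(4)
        if not head:
            return  # clean end of file
        if len(head) < 4:
            raise ValueError("truncated record marker")
        (nbytes,) = struct.unpack("<i", head)
        if nbytes < 0 or nbytes % 8:
            raise ValueError("unexpected record length: %d" % nbytes)
        payload = stream.read(nbytes)
        tail = stream.read(4)
        if len(payload) != nbytes or len(tail) != 4:
            raise ValueError("truncated record")
        if struct.unpack("<i", tail)[0] != nbytes:
            raise ValueError("record markers disagree")
        yield struct.unpack("<%dd" % (nbytes // 8), payload)

# Self-check on an in-memory file holding the same five values as the question.
values = [0.0, 1.0, 3.141592653589793, 0.0, 1.0]
fake = io.BytesIO(b"".join(struct.pack("<idi", 8, v, 8) for v in values))
decoded = [rec[0] for rec in read_records(fake)]
print(decoded)
```

For the real file, pass the object returned by `open("short.test.file_binary64", "rb")` instead of the `BytesIO` stand-in.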
Daniel