I want to minimize the size of output files in FORTRAN without losing any data. To compare the available methods, I wrote this test program:
      program test
      character(len=255) format
    1 format(9i3)
c FORMATTED
      open(99,file='form1.txt',form='formatted')
      do i=1,1
        write(99,1) 1, 2, 3, 4, 5, 6, 7, 8, 9
      enddo
      close(99)
c UNFORMATTED
      open(98,file='form2.txt',form='unformatted')
      do i=1,1
        write(98) 1, 2, 3, 4, 5, 6, 7, 8, 9
      enddo
      close(98)
c DIRECT ACCESS
c (sizeof is a compiler extension; recl is assumed to be in bytes)
      nrec=sizeof(i)*9
      open(97,file='form3.txt',form='unformatted',
     &     access='direct',recl=nrec)
      do i=1,1
        write(97,rec=i) 1, 2, 3, 4, 5, 6, 7, 8, 9
      enddo
      close(97)
c list the sizes of the three files (system is a compiler extension)
      call system('ls -lh form?.txt')
      end
This will create three files with one record each. The output of this program is:
-rw-r--r--. 1 user users 28 May 27 17:10 form1.txt
-rw-r--r--. 1 user users 44 May 27 17:10 form2.txt
-rw-r--r--. 1 user users 36 May 27 17:10 form3.txt
From Oracle's website:
If FORM='UNFORMATTED', each record is preceded and terminated with an INTEGER*4 count, making each record 8 characters longer than normal. This convention is not shared with other languages, so it is useful only for communicating between FORTRAN programs.
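To make the arithmetic in that quote concrete, here is a small sketch of the byte counts I would expect for each of the three files. It assumes 4-byte record markers (as the quote describes), a 1-byte newline, and 8-bit bytes. The storage_size intrinsic needs a Fortran 2008 compiler, and the sketch is written in free form (save it as .f90), unlike the fixed-form program above.

```fortran
program record_sizes
  implicit none
  integer, parameter :: nvals = 9          ! integers written per record
  integer, parameter :: marker_bytes = 4   ! assumed size of one record marker
  integer :: i
  integer :: int_bytes, data_bytes

  ! storage_size returns the size in bits; assume 8-bit bytes
  int_bytes  = storage_size(i) / 8
  data_bytes = nvals * int_bytes

  ! formatted: nine fields of width 3 (format 9i3) plus one newline
  print '(a,i0)', 'formatted (9i3):        ', nvals*3 + 1
  ! sequential unformatted: raw data plus a leading and a trailing marker
  print '(a,i0)', 'unformatted sequential: ', data_bytes + 2*marker_bytes
  ! direct access: raw data only, no record markers
  print '(a,i0)', 'unformatted direct:     ', data_bytes
end program record_sizes
```

With 4-byte default integers this should report 28, 44 and 36 bytes, which matches the listing above.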
My questions are:
- Why is there a difference of 16 bytes (not 8 bytes, as the Oracle quote states) between form1.txt and form2.txt? Note that the size of form1.txt depends on the format (e.g. if I change the line format(9i3) to format(9i4), the size of form1.txt increases by 9 bytes).
and my main question is:
- I have big data files (larger than 100 GB) with five columns and millions of rows. What is the best method in FORTRAN to reduce the size of my output files (perhaps writing them in binary form)?
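To illustrate what I mean by binary form, here is a minimal sketch of one option I am considering: unformatted stream access (Fortran 2003), which as far as I understand writes the raw bytes with no record markers. The file name form4.bin, the row count, and the choice of double precision for all five columns are assumptions made for this example, not my real data layout. It is written in free form and uses newunit, so it needs a Fortran 2008 compiler.

```fortran
program stream_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: ncols = 5       ! five columns, as in my data
  integer, parameter :: nrows = 1000    ! hypothetical row count for the test
  real(dp) :: row(ncols), check(ncols)
  integer :: i, u, fsize, row_bytes, p

  ! bytes per row: storage_size gives bits per element; assume 8-bit bytes
  row_bytes = ncols * storage_size(row) / 8

  ! write every row as raw bytes, with no record markers
  open(newunit=u, file='form4.bin', form='unformatted', &
       access='stream', status='replace')
  do i = 1, nrows
     row = real(i, dp)                  ! placeholder values
     write(u) row
  end do
  close(u)

  ! compare the size on disk with the size of the raw data
  inquire(file='form4.bin', size=fsize)
  print '(a,i0)', 'bytes on disk:  ', fsize
  print '(a,i0)', 'bytes expected: ', nrows * row_bytes

  ! read the last row back by position (pos is 1-based, in file storage units)
  ! for files of 100 GB the position would need a 64-bit integer kind
  p = 1 + (nrows - 1) * row_bytes
  open(newunit=u, file='form4.bin', form='unformatted', &
       access='stream', status='old')
  read(u, pos=p) check
  close(u)
  print '(a,l1)', 'last row round-trips: ', all(check == row)
end program stream_sketch
```

Because the values are stored bit for bit, nothing is lost to rounding, but the file is only portable between machines with the same real kind and byte order.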
A similar question to mine is: Best way to write a large array to file in fortran? Text vs Other