We have a small Fortran program, built with the Intel compiler, that writes binary data. I am trying to port that code to .NET. I researched how unformatted binary output works in Fortran but am a bit lost: given the same input, the two programs do not produce the same binary output. What am I missing?

The Fortran program essentially looks like this:

CHARACTER * 3  dept(400)
CHARACTER * 4  code(30)
CHARACTER * 7  zdate(400)
REAL           wt(400), WT99(400), WT90(400), WT50(400) 
Integer        ops(400), J, K

OPEN( 7 , FILE = 'C\stats.BIN',ACCESS='SEQUENTIAL',FORM='UNFORMATTED')

for each record
    Write(7) code(I) , J , ( dept(K) , wt(K) , ops(K) , zdate(K) ,  WT99(K) , WT90(K) , WT50(K) , K = 1 , J )
next record

And here is my attempt at a .NET rewrite:

Public Class CityStats
    Public Property Dept As String
    Public Property Weight As Integer
    Public Property NumOps As Integer
    Public Property LastOpDate As DateTime
    Public Property CheckDate As DateTime
    'VB identifiers cannot start with a digit, so the percentiles are named PercentileNN
    Public Property Percentile99 As Double
    Public Property Percentile90 As Double
    Public Property Percentile50 As Double
End Class


Dim city As String = "BB", xcode As Integer = 1

Dim str As Stream = File.Open("C:\stats.BIN", FileMode.Create)
Using bw As BinaryWriter = New BinaryWriter(str)
      Dim sb As New StringBuilder(4, 4)
      sb.Append(city & xcode.ToString("00"))
      bw.Write(sb.ToString)
      bw.Write(cityStats.Count)

      'cycle through all the records
      For Each out In cityStats
          bw.Write(out.Dept)
          bw.Write(out.Weight)
          bw.Write(out.NumOps)
          bw.Write(out.CheckDate.ToString("ddMMMyy").ToUpper)
          bw.Write(out.Percentile99)
          bw.Write(out.Percentile90)
          bw.Write(out.Percentile50)
      Next out
End Using
sinDizzy
  • Note that `( dept(K) , wt(K) , ops(K) , zdate(K) , WT99(K) , WT90(K) , WT50(K) , K = 1 , J )` is an implicit loop for K=1 to J. See [this](https://pages.mtu.edu/~shene/COURSES/cs201/NOTES/chap08/io.html). –  Aug 19 '20 at 05:07
  • Do you have a folder called C? If not, you're missing a colon after C – cup Aug 19 '20 at 05:48
  • Got it, thus my loop on "For Each out In CityStats". I also fixed the path in the .NET code. – sinDizzy Aug 19 '20 at 16:22
  • Why would you expect the output to be identical? You appear not to be using the same output library or data record specifications in the two versions. – francescalus Aug 19 '20 at 16:35
  • @francescalus that is why I am asking for help. I included my attempt and am asking for help on how to go about replicating the results. – sinDizzy Aug 19 '20 at 16:40
  • If what you want is to reproduce the output exactly with the new program, then you're going to have to understand exactly what the Intel Fortran IO runtime is doing/reverse engineer it. (Which is the part you are missing: just dumping out values isn't the same thing as creating the structured file that the Fortran compiler creates. I don't know .NET at all well enough to say whether your program is trying to mimic that.) To add to your problems, note that there's no reason to expect the Fortran program to give exactly the same output if run twice. – francescalus Aug 19 '20 at 18:15
  • Yes I get all that. That is why I am here. To see if someone can give me guidance on the Fortran process as I am not too familiar with how the binary output in and of itself works. – sinDizzy Aug 21 '20 at 22:32
  • ACCESS='SEQUENTIAL', FORM='UNFORMATTED' is not the same as BinaryWriter.Write. It is similar but not identical: https://scc.ustc.edu.cn/zlsc/tc4600/intel/2015.1.133/compiler_f/GUID-57E3A72A-38A8-41FC-AEF5-1AD916C03D79.htm – Rob Aug 23 '20 at 19:16

1 Answer


OK, I got sidetracked but did eventually find a solution to my problem. The key is understanding the data types and how they map between Fortran and .NET, together with the record-length markers that Fortran writes around each unformatted sequential record. Below is my example from above, updated to replicate the binary file generated by the Fortran program.

I verified the result by running both binary files through an F90 binary reader. All data was comparable: strings and integers matched exactly, and singles were fairly close. For this process the single-precision values agreed to within a tenth, which was good enough for me since some of the data is rounded.
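A comparison like this can also be scripted. The following is a minimal Python sketch, not the F90 reader mentioned above. It assumes 4-byte little-endian record markers, the field order of the Fortran WRITE statement in the question, and ASCII text. The names `read_records`, `records_from_bytes` and `close_enough`, and the default tolerance of 0.1, are illustrative choices rather than anything taken from the original programs.

```python
import io
import struct

# One pass of the implied DO loop: dept (3 chars), wt (REAL), ops (INTEGER),
# zdate (7 chars), WT99, WT90, WT50 (REAL). "<" means little-endian, no padding.
ROW = struct.Struct("<3sfi7sfff")


def read_records(stream):
    """Yield (code, rows) for each sequential unformatted record in stream."""
    while True:
        head = stream.read(4)
        if len(head) < 4:
            return  # end of file
        (length,) = struct.unpack("<i", head)
        payload = stream.read(length)
        (tail,) = struct.unpack("<i", stream.read(4))
        if tail != length:
            raise ValueError("leading and trailing record markers differ")
        code = payload[:4].decode("ascii")
        (count,) = struct.unpack_from("<i", payload, 4)
        rows = []
        for k in range(count):
            dept, wt, ops, zdate, w99, w90, w50 = ROW.unpack_from(payload, 8 + k * ROW.size)
            rows.append((dept.decode("ascii"), wt, ops, zdate.decode("ascii"), w99, w90, w50))
        yield code, rows


def records_from_bytes(data):
    """Convenience wrapper for parsing a file that is already in memory."""
    return list(read_records(io.BytesIO(data)))


def close_enough(a, b, tol=0.1):
    """Compare two parsed rows: text and integers exactly, reals within tol."""
    return all(
        abs(x - y) <= tol if isinstance(x, float) else x == y
        for x, y in zip(a, b)
    )
```

To compare two files, open each with `open(path, "rb")`, pass it to `read_records`, and run `close_enough` over the paired rows.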

Public Class CityStats
    Public Property Dept As String
    Public Property Weight As Integer
    Public Property NumOps As Integer
    Public Property LastOpDate As DateTime
    Public Property CheckDate As DateTime
    'VB identifiers cannot start with a digit, so the percentiles are named PercentileNN
    Public Property Percentile99 As Double
    Public Property Percentile90 As Double
    Public Property Percentile50 As Double
End Class


Dim city As String = "BB", xcode As Integer = 1

'open a main BinaryWriter
Dim str As Stream = File.Open("C:\stats.BIN", FileMode.Create)
Using bw As BinaryWriter = New BinaryWriter(str)
      'write to a temp BinaryWriter so we can determine the record length before we write it to the main BinaryWriter.
      Dim msRec As New MemoryStream
      Dim bwRec As New BinaryWriter(msRec)

      'code is declared CHARACTER*4 in the Fortran source, so the fixed length is 4
      Dim cx As String = F90FixedString(city & xcode.ToString("00"), 4)
      bwRec.Write(cx.ToCharArray)   'see note below about strings vs char arrays
      bwRec.Write(cityStats.Count)

      'cycle through all the records for one X code and write them out as binary. a FORTRAN REAL defaults
      'to a 4-byte single precision float, which in .NET is the type Single. we do not write out a String because
      'the BinaryWriter would prepend it with the string length. instead we convert the string to an array of characters.
      'the Fortran Integer is a 4-byte signed integer by default, so it is equivalent to the .NET Integer type.
      For Each out In cityStats
          bwRec.Write(out.Dept.ToCharArray)
          'wt is declared REAL in the Fortran source, so it is written as a Single
          bwRec.Write(CSng(out.Weight))
          bwRec.Write(out.NumOps)
          bwRec.Write(out.CheckDate.ToString("ddMMMyy").ToUpper.ToCharArray)
          bwRec.Write(CSng(out.Percentile99))
          bwRec.Write(CSng(out.Percentile90))
          bwRec.Write(CSng(out.Percentile50))
      Next out

      'get record length in bytes. Note: BaseStream.Length is 64-bit but my data is rather small so I am confident it will never exceed 32 bits
      Dim recLenInBytes As Integer = CInt(bwRec.BaseStream.Length)

      'now write the temp BinaryWriter to our main BinaryWriter, preceded and followed by a 32-bit integer
      'containing the record length.
      bw.Write(recLenInBytes)
      bw.Write(msRec.ToArray)
      bw.Write(recLenInBytes)
End Using


'a fixed length string in FORTRAN is left-justified and padded on the right with spaces.
Public Function F90FixedString(ByVal inp As String, ByVal strLen As Integer) As String
    'checks
    If inp Is Nothing Then Throw New Exception("Null string. Can't create a fixed length string.")

    'if the string is already the required length then just return the input string
    If inp.Length = strLen Then Return inp

    'pad the string to the right with spaces if required
    Dim out As String = inp.PadRight(strLen, " "c)

    'now check to see if the new string is longer than required and if so then we take the
    'proper amount of characters starting from the left.
    If out.Length > strLen Then out = out.Substring(0, strLen)

    'return the new string
    Return out
End Function

Information was a bit tough to find and some of it was outdated. With these links and some trial and error it all worked out. I'm using the Lahey F90 compiler so your results may be different based on your particular compiler and settings.

Opening Binary Files in Fortran: Status, Form, Access https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/compiler-reference/data-and-i-o/fortran-i-o/record-length.html

Unformatted files (FORM= 'UNFORMATTED'): Specify the record length in 4-byte units, unless you specify the assume byterecl compiler option to request 1-byte units.

https://community.intel.com/t5/Intel-Fortran-Compiler/Convert-REAL-8-Unformatted-Sequential-file/td-p/771973

The Visual Fortran SEQUENTIAL UNFORMATTED file layout is one that is very common on UNIX systems, including DIGITAL UNIX. Each Fortran "record" is preceded and followed by a 32-bit integer containing the record length (the integer is stored in the "little-endian" layout, with the least-significant bit in the lowest addressed byte.) This record length is required to apply Fortran record semantics, including the ability to use BACKSPACE. Unfortunately, this is not correctly documented in the Programmer's Guide - the layout described there is actually that of Microsoft Fortran PowerStation. This will be corrected in future editions.
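The layout described in that quote can be sketched in a few lines. This is a minimal Python illustration only. It assumes 4-byte little-endian markers and the field sizes implied by the Fortran declarations in the question (CHARACTER*4, CHARACTER*3, CHARACTER*7, and 4-byte REAL and INTEGER). `fixed`, `build_payload` and `frame_record` are hypothetical helper names.

```python
import struct


def fixed(text, width):
    """Left-justify, space-pad and truncate to width bytes, like CHARACTER*width."""
    return text.encode("ascii").ljust(width)[:width]


def build_payload(code, rows):
    """Pack code, J, then J rows of (dept, wt, ops, zdate, wt99, wt90, wt50)."""
    out = fixed(code, 4) + struct.pack("<i", len(rows))
    for dept, wt, ops, zdate, wt99, wt90, wt50 in rows:
        out += fixed(dept, 3) + struct.pack("<fi", wt, ops)
        out += fixed(zdate, 7) + struct.pack("<fff", wt99, wt90, wt50)
    return out


def frame_record(payload):
    """Wrap the payload in leading and trailing 32-bit length markers."""
    marker = struct.pack("<i", len(payload))
    return marker + payload + marker
```

Under these assumptions each record carries 8 + 30 * J payload bytes (4 for the code, 4 for J, and 30 per row), plus 8 bytes for the two markers.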

Why does BinaryWriter prepend gibberish to the start of a stream? How do you avoid it? https://software.intel.com/content/www/us/en/develop/documentation/fortran-compiler-developer-guide-and-reference/top/compiler-reference/data-and-i-o/fortran-i-o/record-length.html

.NET BinaryWriter.Write() Method -- Writing Multiple Datatypes Simultaneously https://learn.microsoft.com/en-us/dotnet/api/system.io.binarywriter.write?view=netframework-4.7
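The "gibberish" in the BinaryWriter question above is the length prefix that BinaryWriter.Write(String) emits before the text: the encoded byte count, stored 7 bits per byte. A minimal Python emulation, for illustration only (`seven_bit_encoded` and `dotnet_string_bytes` are made-up names, and the writer's default UTF-8 encoding is assumed):

```python
def seven_bit_encoded(n):
    """Encode a non-negative integer 7 bits at a time, low bits first."""
    out = bytearray()
    while n >= 0x80:
        out.append((n & 0x7F) | 0x80)  # high bit set: more bytes follow
        n >>= 7
    out.append(n)
    return bytes(out)


def dotnet_string_bytes(text):
    """Bytes a length-prefixed string write would produce for text."""
    data = text.encode("utf-8")
    return seven_bit_encoded(len(data)) + data


def char_array_bytes(text):
    """Bytes produced when only the characters are written, with no prefix."""
    return text.encode("utf-8")
```

For the 4-character code "BB01" the prefixed form is 5 bytes long, starting with a byte of value 4, while the character-array form is the 4 bytes Fortran expects.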

sinDizzy