I need to read a COBOL file into VB.NET. Here is the description of the data types from the documentation:

All magnetic tape files are recorded in 9-track, 800 BPI mode with odd parity. They are created on IBM equipment under the Disk Operating System (IBM System/360 standard).

Binary - Data is coded in pure binary code.

BCD - Data is coded in binary coded decimal format. (Primarily for files created by the IBM 1401 System.)

EBCDIC - Data is coded in extended binary coded decimal interchange code. (An IBM-developed code.)

Packed - Data is coded in packed decimal format. 

File Format:
1-2 Record Count [Numeric] (Binary)
3-4 Filler (Binary)
5-5 Record Type [B or R] (EBCDIC)
6-10 Sales Location Numeric [9 digit number] (Packed)
11-13 Sales Identifier (3 character Alpha) (EBCDIC)
etc

So, I know I should read the entire file into a byte array, and that's about the limit of what I know how to do.

A) I saw another post on EBCDIC conversion using

System.Text.Encoding.GetEncoding(37)

but that was applied to an entire file. If I run the whole file through it I see intelligible text, but of course the other fields come out as junk. I don't know how to decode a single field properly.

B) I have no idea what to do with the PURE Binary format.

C) I don't know how to read Packed, particularly as a single field.

I've tried a variety of decoding options for PURE BINARY, but the number I get for the first field is not consistent with the stated length of the rows in the docs.

airlineguy
  • You might find https://stackoverflow.com/questions/2858202/how-to-convert-from-ebcdic-to-ascii-in-c-net to be useful, even though it's C# rather than VB. – Jeff Zeitlin Jun 17 '21 at 13:54
  • Yup that helps. The Pure Binary one is killing me. – airlineguy Jun 17 '21 at 15:11
  • PACKED is BCD, but with two digits per byte (i.e., 4 bits per digit). I don't recall whether BCD was big-endian or little-endian. – Jeff Zeitlin Jun 17 '21 at 15:55
  • Do you know how to decode BCD in terms of the system.text.encoding option? – airlineguy Jun 17 '21 at 16:10
  • It's not an encoding; you'll have to process it manually. – Jeff Zeitlin Jun 17 '21 at 16:15
  • There is a description of PACKED DECIMAL format (also called COMP-3 in COBOL) at http://www.simotime.com/datapk01.htm – Jeff Zeitlin Jun 17 '21 at 16:27
  • @JeffZeitlin that site is correct but incomplete. x'A' and x'E' are _valid but not preferred_ positive sign nibbles. x'B' is a _valid but not preferred_ negative sign nibble. This is documented in the Decimal Instructions chapter of the z/Architecture Principles of Operation. – cschneid Jun 17 '21 at 18:06
  • @cschneid - It's likely that a COBOL program with USAGE IS COMP-3 would not use those values for the sign nybble. Good to know just how far out of date I am, though... :) – Jeff Zeitlin Jun 17 '21 at 18:14
  • Thanks jeff, any ideas on the PURE BINARY? I can't continue without getting the record count to know where the row stops at. – airlineguy Jun 17 '21 at 18:55
  • Binary would just be a 16-bit [unsigned?] integer (SHORT or USHORT), but watch out for big-endian vs little-endian. – Jeff Zeitlin Jun 18 '21 at 11:13
  • BIG-ENDIAN! That was it. I used this: bigEndian = BitConverter.ToInt16(Byte2.Reverse.ToArray, 0). For some reason there are giant records in the middle of the file. I have no idea what they are, but finding this revealed the issue. – airlineguy Jun 18 '21 at 15:14

1 Answer

Packed decimal format:

For s9(5)V9(4) comp-3, 123.45 is represented in byte format as

      00 12 34 50 0c

Each digit is represented by 4 bits, the last 4 bits hold the sign (c, meaning positive), and there is an assumed decimal point after the 3.

Most languages provide a routine for converting a byte (or several bytes) into its hexadecimal string form, e.g. byte x'34' -> string '34'. So you can:

  1. Convert the bytes to a String representation
  2. Add the decimal point in
  3. Strip off the sign character from the end and add the appropriate sign to the front
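
The three steps above can be sketched as follows. This is a minimal illustration in Python rather than VB.NET, and the function name `unpack_comp3` and its `scale` parameter (the number of implied decimal places) are made up for the example. The sign handling follows the nibble values mentioned in the comments on the question: b and d are treated as negative, while a, c, e and f are treated as positive or unsigned.

```python
from decimal import Decimal


def unpack_comp3(field: bytes, scale: int = 0) -> Decimal:
    """Decode one packed-decimal (COMP-3) field.

    `scale` is a hypothetical parameter: the number of implied decimal
    places, e.g. 4 for s9(5)V9(4).
    """
    if not field:
        raise ValueError("empty field")
    # Step 1: bytes -> hex text, e.g. 00 12 34 50 0c -> '001234500c'
    hex_text = field.hex()
    digits, sign = hex_text[:-1], hex_text[-1]
    if not digits.isdigit() or sign not in "abcdef":
        raise ValueError(f"not a packed-decimal field: {hex_text}")
    # Step 2: put the implied decimal point in
    value = Decimal(int(digits)).scaleb(-scale)
    # Step 3: apply the sign taken from the last nibble
    return -value if sign in "bd" else value


print(unpack_comp3(bytes.fromhex("001234500c"), scale=4))  # 123.4500
```

The same logic should port to VB.NET with any byte-to-hex routine; working on the hex text keeps the digit and sign handling easy to check by eye against a hex dump of the file.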

There are other ways:


Other fields

  • The first field (binary) might be a big-endian binary integer or another packed decimal. There is probably a utility built into .NET to do this.
  • Convert the character fields from ebcdic to ascii one field at a time
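
As a rough sketch of both points, the Python below slices individual fields out of one record using the column positions from the question's layout, reads the first field as a big-endian signed 16-bit integer, and decodes only the character fields from EBCDIC (code page 37, `cp037` in Python). The function name `read_header_fields` and the sample bytes are invented for illustration. In .NET, the `Encoding.GetString(bytes, index, count)` overload allows the same field-at-a-time decoding.

```python
import struct


def read_header_fields(record: bytes) -> dict:
    """Decode the binary and character fields of one record.

    Column positions in the layout are 1-based and inclusive, so
    columns 11-13 become the 0-based slice [10:13].
    """
    # ">h" = big-endian signed 16-bit; use ">H" if the field is unsigned
    (count,) = struct.unpack(">h", record[0:2])
    return {
        "record_count": count,
        "record_type": record[4:5].decode("cp037"),
        "sales_identifier": record[10:13].decode("cp037"),
    }


# Invented sample: count=16, filler, 'B', a packed field, 'ABC'
sample = bytes.fromhex("0010" "0000" "c2" "000012345c" "c1c2c3")
print(read_header_fields(sample))
# {'record_count': 16, 'record_type': 'B', 'sales_identifier': 'ABC'}
```

The packed field at columns 6-10 is skipped here on purpose: running it through an EBCDIC decoder is what produces the junk characters described in the question.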

In VBA you did not need to read the whole file in; you could read it record by record. I would presume you can do the same in VB.NET.
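
A record-by-record reader could look something like the Python sketch below. It rests on assumptions the documentation quoted in the question does not confirm: that the first two bytes of each record hold that record's length as a big-endian integer, and that this length includes the 4-byte header itself (as it does in an IBM record descriptor word). The function `iter_records` and its `length_includes_header` flag are hypothetical; flip the flag if the lengths in the real file turn out to exclude the header.

```python
import io
import struct
from typing import BinaryIO, Iterator


def iter_records(stream: BinaryIO,
                 length_includes_header: bool = True) -> Iterator[bytes]:
    """Yield each record (header included) from length-prefixed data.

    ASSUMPTION: bytes 1-2 are the record length, big-endian, and
    bytes 3-4 are filler. Not confirmed by the quoted documentation.
    """
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of file
        if len(header) < 4:
            raise ValueError("truncated record header")
        (length,) = struct.unpack(">h", header[:2])
        body_length = length - 4 if length_includes_header else length
        if body_length < 0:
            raise ValueError(f"implausible record length: {length}")
        body = stream.read(body_length)
        if len(body) < body_length:
            raise ValueError("truncated record")
        yield header + body


# Invented demo data: a 6-byte record followed by a 5-byte record
demo = io.BytesIO(bytes.fromhex("00060000c1c2" "00050000c3"))
print([r.hex() for r in iter_records(demo)])
# ['00060000c1c2', '00050000c3']
```

With a real file, pass `open(path, "rb")` instead of the `BytesIO` demo object.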


Useful Utilities

These tools might be useful for testing.

  • The RecordEditor should be able to display the file. The Layout Wizard should be able to determine the format of the file. Alternatively, use the COBOL copybook below.

  • The Java program CobolToCsv should be able to convert the file to CSV.

          01  tape-record.
              05 record-count          pic s9(3) comp.
              05 filler                pic x(2).
              05 record-type           pic x.
              05 Sales-Location        pic s9(9) comp-3.
              05 Sales-Identifier      pic x(3).
    
Bruce Martin