-1

I have a COBOL file which is not in human-readable format: it has data (numbers) in comp-3 format, but also other strings and characters. I have an algorithm for converting the comp-3 format, but when I apply it to the array of bytes from the file, all the characters get converted and the output is wrong. How could I decode the entire file correctly, taking into consideration both comp-3 data and normal data?

I will add some lines from the file below and also my algorithm for comp-3:

The file in Notepad++ looks like this (first lines):

AH4820øêæÉ* 200 DBAG DBAG 0
AED S EUR AED S KAS°ê¤ 2    ø TN Øê¤ ð §É! SN ê¤

After Notepad++'s ASCII-to-hex conversion the file looks like this, although this is probably not right:

200F41483438323002C3B8C3AA01C3A6 01C3892A202020202020202020202020 20203230302044424147204442414720 30202020202020202020202020202020

    // Unpacks one COMP-3 (packed decimal) field: two digits per byte,
    // except the last byte, whose low nibble holds the sign.
    public static String unpackData(byte[] packedData) {
        final int negativeSign = 0x0D; // 13; other sign nibbles are not handled here
        StringBuilder unpackedData = new StringBuilder();
        for (int i = 0; i < packedData.length; i++) {
            int firstDigit = (packedData[i] >>> 4) & 0x0F;
            int secondDigit = packedData[i] & 0x0F;
            unpackedData.append(firstDigit);
            if (i == packedData.length - 1) {
                // Last byte: the low nibble is the sign, not a digit.
                if (secondDigit == negativeSign) {
                    unpackedData.insert(0, '-');
                }
            } else {
                unpackedData.append(secondDigit);
            }
        }
        return unpackedData.toString();
    }
  • 3
    I wouldn't call it "encrypted" ;-) Have you tried e.g. https://sourceforge.net/projects/jrecord/? Maybe that solves your problem already without the need of reinventing the wheel. – Lothar Jun 20 '19 at 18:56
  • 3
    The data you posted doesn't look like valid COMP-3 to me. Has the file already undergone some EBCDIC->ASCII conversion? Then it is already broken beyond repair. You'll have to do all handling of COMP-3 fields before any codepage conversion, taking into account the record layout as defined in the corresponding copybook. – piet.t Jun 21 '19 at 07:03
  • Is this question answered for you? If yes, please mark whatever answer worked for you as "accepted" (and upvote any useful answers). If not, clarify what is missing so we can add to it. – Simon Sobisch Jun 29 '19 at 15:16

2 Answers

3

...encrypted cobol file with comp-3 and other data ... not in human-readable format

Don't mix the two. An encrypted file could actually be human-readable (just not containing anything meaningful), for example when the encryption works by exchanging words. "Not in human-readable format" has nothing to do with encryption (and I fail to see why this is tagged as spring).

Back to the original question:

How could I decode the entire file correctly, taking into consideration both comp-3 data and normal data?

You split the byte array into ranges of "normal" data (in your case I think you mean unpacked, very likely in a single-byte encoding) and "encoded" (packed) data.

Then convert the unpacked data to either a String (effectively UTF-16) or a numeric type, and the packed data to numeric types. I did not check whether your comp-3 unpack is correct, but it looks fine, or at least not completely wrong; note that 13 may not be the only possible negative sign marker, depending on the system that produced the data.
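To illustrate the point about sign markers, here is a minimal sketch of a COMP-3 decoder that returns a number instead of a String. The class name `PackedDecimal` is made up for this example, and it assumes the common convention that sign nibbles 0xD and 0xB mean negative while 0xA, 0xC, 0xE and 0xF mean positive or unsigned; check what your producing system actually writes. It also assumes the field fits into a `long` (at most 18 digits).

```java
class PackedDecimal {
    /**
     * Decodes one COMP-3 field into a long.
     * Assumption: sign nibble 0xD or 0xB means negative, any other
     * value in 0xA..0xF means positive or unsigned.
     */
    static long unpack(byte[] packed) {
        long value = 0;
        for (int i = 0; i < packed.length; i++) {
            int high = (packed[i] >> 4) & 0x0F;
            int low = packed[i] & 0x0F;
            if (high > 9) {
                throw new IllegalArgumentException("Invalid digit nibble at byte " + i);
            }
            value = value * 10 + high;
            if (i < packed.length - 1) {
                // Every byte except the last one carries two digits.
                if (low > 9) {
                    throw new IllegalArgumentException("Invalid digit nibble at byte " + i);
                }
                value = value * 10 + low;
            } else {
                // The low nibble of the last byte is the sign.
                if (low < 0x0A) {
                    throw new IllegalArgumentException("Missing sign nibble");
                }
                if (low == 0x0D || low == 0x0B) {
                    value = -value;
                }
            }
        }
        return value;
    }
}
```

Rejecting invalid nibbles is a deliberate choice here: if text bytes are accidentally fed into the decoder, it fails loudly instead of producing a plausible-looking number.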

COBOL is record-based, mostly with fixed-length records, so the key to "decoding the file" is to split the file into records and fields (get the original COBOL record definition). In most cases you'd create a POJO with the same attributes as the COBOL definition and have a piece of code split the byte array into records and fields by position, convert the pieces as needed, and invoke your setters with the result. Human-readable: an easy approach may be to generate the toString method...
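A rough sketch of that splitting, under loud assumptions: the record layout below (a 6-byte text ID followed by a 3-byte `PIC S9(5) COMP-3` amount) is invented for illustration and is not the layout of the file in the question, and the names `RecordSplitter` and `Item` are hypothetical. The text charset is a parameter because it depends on where the file came from. For brevity the value holder uses a constructor rather than setters.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

class RecordSplitter {
    // Hypothetical layout, NOT taken from the question's file:
    //   bytes 0-5 : ID      PIC X(6)          (text)
    //   bytes 6-8 : AMOUNT  PIC S9(5) COMP-3  (packed, 3 bytes)
    static final int ID_OFFSET = 0;
    static final int ID_LENGTH = 6;
    static final int AMOUNT_OFFSET = 6;
    static final int AMOUNT_LENGTH = 3;
    static final int RECORD_LENGTH = 9;

    /** Value holder mirroring the hypothetical record definition. */
    static final class Item {
        final String id;
        final long amount;

        Item(String id, long amount) {
            this.id = id;
            this.amount = amount;
        }

        @Override
        public String toString() {
            return "Item{id='" + id + "', amount=" + amount + "}";
        }
    }

    /** Cuts record number recordIndex out of the raw bytes and converts each field. */
    static Item parse(byte[] file, int recordIndex, Charset textCharset) {
        int start = recordIndex * RECORD_LENGTH;
        // Text field: decode only these bytes with the text charset.
        String id = new String(file, start + ID_OFFSET, ID_LENGTH, textCharset);
        // Packed field: take the raw bytes, never run them through a charset.
        int from = start + AMOUNT_OFFSET;
        byte[] packed = Arrays.copyOfRange(file, from, from + AMOUNT_LENGTH);
        return new Item(id, unpack(packed));
    }

    /** Compact COMP-3 decoder; assumes sign nibble 0xD or 0xB means negative. */
    static long unpack(byte[] packed) {
        long value = 0;
        for (int i = 0; i < packed.length; i++) {
            value = value * 10 + ((packed[i] >> 4) & 0x0F);
            int low = packed[i] & 0x0F;
            if (i < packed.length - 1) {
                value = value * 10 + low;
            } else if (low == 0x0D || low == 0x0B) {
                value = -value;
            }
        }
        return value;
    }
}
```

The important part is that the whole file is read as bytes (for example with `Files.readAllBytes`) and only the text ranges are ever decoded as characters; the packed ranges stay binary until they are unpacked.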

Simon Sobisch
1

Editing the File

Any hex editor that supports EBCDIC should be able to display the file in a readable form. Wikipedia suggests HxD, VEdit, UltraEdit and WinHex are hex editors that support EBCDIC.

Another alternative is the recordEditor. It can display the file with or without a COBOL copybook.


JRecord

JRecord lets you read/write mainframe files using a COBOL copybook. You can generate basic Java~JRecord code in the recordEditor.

See How do you generate java~jrecord code for a Cobol copybook

Bruce Martin